Revert "don't need this"

This reverts commit 93121c18e0.
don't need this
2025-10-22 14:53:25 +11:00 · 2025-10-22 14:30:13 +11:00 · 2025-10-22 14:27:17 +11:00 · 2025-10-22 14:19:00 +11:00 · 2025-10-21 16:00:58 +11:00 · 2025-10-21 14:34:41 +11:00
58 changed files with 15949 additions and 1136 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -6,6 +6,8 @@ target
 # These are backup files generated by rustfmt
 **/*.rs.bk

+**/.DS_Store
+
 # MSVC Windows builds of rustc generate these, which store debugging information
 *.pdb

@@ -19,3 +21,7 @@ target
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+# Session logs directory
+logs/
+*.json
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,33 @@
+# Changelog
+
+## [Unreleased]
+
+### Added
+
+**Interactive Requirements Mode**
+- **AI-Enhanced Interactive Requirements**: New `--interactive-requirements` flag for autonomous mode
+  - User enters brief description of what they want to build
+  - AI automatically enhances input into structured requirements.md document
+  - Generates professional markdown with:
+    - Project title and overview
+    - Organized requirements (functional, technical, quality)
+    - Acceptance criteria
+  - User can review, accept, edit manually, or cancel before proceeding
+  - Seamlessly transitions to autonomous mode
+
+**Autonomous Mode Configuration**
+- **Autonomous Mode Configuration**: Added ability to specify different models for coach and player agents in autonomous mode
+  - New `[autonomous]` configuration section in `g3.toml`
+  - `coach_provider` and `coach_model` options for coach agent
+  - `player_provider` and `player_model` options for player agent
+  - `Config::for_coach()` and `Config::for_player()` methods to generate role-specific configurations
+  - Comprehensive test suite for autonomous configuration
+
+### Changed
+- Autonomous mode now uses `config.for_player()` for the player agent
+- Coach agent creation now uses `config.for_coach()` for the coach agent
+
+### Benefits
+- **Cost Optimization**: Use cheaper models for execution, expensive models for review
+- **Speed Optimization**: Use faster models for iteration, thorough models for validation
+- **Specialization**: Leverage different providers' strengths for different roles
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -4,7 +4,8 @@ members = [
    "crates/g3-core", 
    "crates/g3-providers",
    "crates/g3-config",
-    "crates/g3-execution"
+    "crates/g3-execution",
+    "crates/g3-computer-control"
 ]
 resolver = "2"

--- a/DESIGN.md
+++ b/DESIGN.md
@@ -1,157 +1,316 @@
-# G3 General Purpose AI Agent - Design Document
+# G3 - AI Coding Agent - Design Document

 ## Overview
-G3 is a **code-first AI agent** that helps you complete tasks by writing and executing code or scripts. Instead of just giving advice, G3 solves problems by generating executable code in the appropriate language.
+
+G3 is a **modular, composable AI coding agent** built in Rust that helps you complete tasks by writing and executing code. It provides a flexible architecture for interacting with various Large Language Model (LLM) providers while offering powerful code generation, file manipulation, and task automation capabilities.
+
+The agent follows a **tool-first philosophy**: instead of just providing advice, G3 actively uses tools to read files, write code, execute commands, and complete tasks autonomously.

 ## Core Principles
-1. **Code-First Philosophy**: Always try to solve problems with executable code
-2. **Multi-Language Support**: Generate scripts in Python, Bash, JavaScript, Rust, etc.
-3. **Unix Philosophy**: Small, focused tools that do one thing well
+
+1. **Tool-First Philosophy**: Solve problems by actively using tools rather than just providing advice
+2. **Modular Architecture**: Clear separation of concerns across multiple Rust crates
+3. **Provider Flexibility**: Support multiple LLM providers through a unified interface
 4. **Modularity**: Clear separation of concerns
 5. **Composability**: Components can be combined in different ways
-6. **Performance**: Blazing fast execution
+6. **Performance**: Built in Rust for speed and reliability
+7. **Context Intelligence**: Smart context window management with auto-summarization
+8. **Error Resilience**: Robust error handling with automatic retry logic

-## Architecture
+## Project Structure

-### High-Level Components
+G3 is organized as a Rust workspace with the following crates:
+
+```
+g3/
+├── src/main.rs                   # Main entry point (delegates to g3-cli)
+├── crates/
+│   ├── g3-cli/                   # Command-line interface, TUI, and retro mode
+│   ├── g3-core/                  # Core agent engine, tools, and streaming logic
+│   ├── g3-providers/             # LLM provider abstractions and implementations
+│   ├── g3-config/                # Configuration management
+│   ├── g3-execution/             # Code execution engine
+│   └── g3-computer-control/      # Computer control and automation
+├── logs/                         # Session logs (auto-created)
+├── README.md                     # Project documentation
+└── DESIGN.md                     # This design document
+```
+
+## Architecture Overview
+
+### High-Level Architecture

 ```
 ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
-│   CLI Module    │    │  Core Engine    │    │ LLM Providers   │
+│   g3-cli        │    │   g3-core       │    │ g3-providers    │
 │                 │    │                 │    │                 │
-│ - Task commands │◄──►│ - Task          │◄──►│ - OpenAI        │
-│ - Interactive   │    │   interpretation│    │ - Anthropic     │
-│   mode          │    │ - Code          │    │ - Embedded      │
-│ - Code exec     │    │   generation    │    │   (llama.cpp)   │
-│   approval      │    │ - Script        │    │ - Custom APIs   │
-│                 │    │   execution     │    │                 │
+│ • CLI parsing   │◄──►│ • Agent engine  │◄──►│ • Anthropic     │
+│ • Interactive   │    │ • Context mgmt  │    │ • Databricks    │
+│ • Retro TUI     │    │ • Tool system   │    │ • Embedded      │
+│ • Autonomous    │    │ • Streaming     │    │   (llama.cpp)   │
+│   mode          │    │ • Task exec     │    │ • OAuth flow    │
+│                 │    │ • TODO mgmt     │    │                 │
 └─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
-                    ┌─────────────────┐
-                    │   Execution     │
-                    │   Engine        │
-                    │                 │
-                    │ - Python        │
-                    │ - Bash/Shell    │
-                    │ - JavaScript    │
-                    │ - Rust          │
-                    │ - Sandboxing    │
+                    ┌─────────────────┐    ┌─────────────────┐
+                    │ g3-execution    │    │   g3-config     │
+                    │                 │    │                 │
+                    │ • Code exec     │    │ • TOML config   │
+                    │ • Shell cmds    │    │ • Env overrides │
+                    │ • Streaming     │    │ • Provider      │
+                    │ • Error hdlg    │    │   settings      │
+                    └─────────────────┘    │ • Computer      │
+                             │              │   control cfg   │
+                             │              └─────────────────┘
+                             │                       │
+                    ┌─────────────────┐             │
+                    │ g3-computer-    │◄────────────┘
+                    │   control       │
+                    │ • Mouse/kbd     │
+                    │ • Screenshots   │
+                    │ • OCR/Tesseract │
+                    │ • Windows/UI    │
                    └─────────────────┘
 ```

-### Module Breakdown
+## Core Components

-#### 1. CLI Module (`g3-cli`)
- **Responsibility**: User interface and task interpretation
- **New Features**:
-  - Progress indicators for script execution
+### 1. g3-core: Agent Engine

-#### 2. Core Engine (`g3-core`)
- **Responsibility**: Task interpretation and code generation
- **New Features**:
-  - Task analysis and decomposition
-  - Language selection based on task type
-  - Code generation with execution context
-  - Script template system
-  - Autonomous execution of generated code
+**Primary Responsibilities:**
+- Main orchestration logic for handling conversations and task execution
+- Context window management with intelligent token tracking
+- Built-in tool system for file operations and command execution
+- Streaming response parsing with real-time tool call detection
+- Error handling with automatic retry logic

-#### 3. LLM Providers (`g3-providers`)
- **Responsibility**: LLM communication and model abstraction
- **Supported Providers**:
-  - **OpenAI**: GPT-4, GPT-3.5-turbo via API
-  - **Anthropic**: Claude models via API  
-  - **Embedded**: Local open-weights models via llama.cpp
- **Enhanced Prompts**:
-  - Code-first system prompts
-  - Language-specific generation instructions
+**Key Features:**
+- **Context Window Intelligence**: Automatic monitoring with percentage-based tracking (80% capacity triggers auto-summarization)
+- **Tool System**: Built-in tools for file operations (read, write, edit), shell commands, and structured output
+- **Streaming Parser**: Real-time parsing of LLM responses with tool call detection and execution
+- **Session Management**: Automatic session logging with detailed conversation history and token usage
+- **Error Recovery**: Sophisticated error classification and retry logic for recoverable errors
+- **TODO Management**: In-memory TODO list with read/write tools for task tracking

-#### 5. Embedded Provider (`g3-core/providers/embedded`) - NEW
- **Responsibility**: Local model inference using llama.cpp
- **Features**:
-  - GGUF model support (Llama, CodeLlama, Mistral, etc.)
-  - GPU acceleration via CUDA/Metal
-  - Configurable context length and generation parameters
-  - Async-compatible inference without blocking
-  - Thread-safe model access
-  - Stop sequence detection
+**Available Tools:**
+- `shell`: Execute shell commands with streaming output
+- `read_file`: Read file contents with optional character range support
+- `write_file`: Create or overwrite files with content
+- `str_replace`: Apply unified diffs to files with precise editing
+- `final_output`: Signal task completion with detailed summaries
+- `todo_read`: Read the entire TODO list content
+- `todo_write`: Write or overwrite the entire TODO list
+- `mouse_click`: Click the mouse at specific coordinates
+- `type_text`: Type text at the current cursor position
+- `find_element`: Find UI elements by text, role, or attributes
+- `take_screenshot`: Capture screenshots of screen, region, or window
+- `extract_text`: Extract text from images or screen regions using OCR
+- `find_text_on_screen`: Find text visually on screen and return coordinates
+- `list_windows`: List all open windows with IDs and titles

-#### 4. Execution Engine (`g3-execution`) - NEW
- **Responsibility**: Safe code execution
- **Features**:
-  - Multi-language script execution
-  - Sandboxing and security
-  - Resource limits
-  - Output capture and formatting
-  - Error handling and recovery
+### 2. g3-providers: LLM Provider Abstraction

-### Task Types and Language Selection
+**Primary Responsibilities:**
+- Unified interface for multiple LLM providers
+- Provider-specific optimizations and feature support
+- OAuth authentication flows
+- Streaming and non-streaming completion support

-| Task Type | Preferred Language | Use Cases |
-|-----------|-------------------|-----------|
-| Data Processing | Python | CSV/JSON analysis, data transformation |
-| File Operations | Bash/Shell | File manipulation, backups, organization |
-| System Admin | Bash/Shell | Process management, system monitoring |
-| Text Processing | Python/Bash | Log analysis, text transformation |
-| Database | Python/SQL | Data migration, queries, reporting |
-| Image/Media | Python | Image processing, format conversion |
-| Development | Rust | Code generation, project setup |
+**Supported Providers:**
+- **Anthropic**: Claude models via API with native tool calling support
+- **Databricks**: Foundation Model APIs with OAuth and token-based authentication (default provider)
+- **Embedded**: Local models via llama.cpp with GPU acceleration (Metal/CUDA)
+- **Provider Registry**: Dynamic provider management and hot-swapping

-## Implementation Plan
+**Key Features:**
+- **Native Tool Calling**: Full support for structured tool calls where available
+- **Fallback Parsing**: JSON tool call parsing for providers without native support
+- **OAuth Integration**: Built-in OAuth flow for secure provider authentication
+- **Context-Aware**: Provider-specific context length and token limit handling
+- **Streaming Support**: Real-time response streaming with tool call detection

-### Phase 1: Core Refactoring ✅
-1. ✅ Update CLI commands for task-oriented interface
-2. ✅ Enhance system prompts for code-first approach
-3. ✅ Add basic code execution capabilities
-4. ✅ Update interactive mode messaging
+### 3. g3-cli: Command-Line Interface

-### Phase 2: Enhanced Provider Support ✅
-1. ✅ Implement embedded model provider using llama.cpp
-2. ✅ Add GGUF model support for local inference
-3. ✅ Configure GPU acceleration and performance optimization
-4. ✅ Add comprehensive logging and debugging support
+**Primary Responsibilities:**
+- Command-line argument parsing and validation
+- Interactive terminal interface with history support
+- Retro-style terminal UI (80s sci-fi inspired)
+- Autonomous mode with coach-player feedback loops
+- Session management and workspace handling

-### Phase 3: Advanced Features (Future)
-1. Model quantization and optimization
-2. Multi-model ensemble support
-3. Advanced code execution sandboxing
-4. Plugin system for custom providers
-5. Web interface for remote access
+**Execution Modes:**
+- **Single-shot**: Execute one task and exit
+- **Interactive**: REPL-style conversation with the agent (default mode)
+- **Autonomous**: Coach-player feedback loop for complex projects
+- **Retro TUI**: Full-screen terminal interface with real-time updates
+
+**Key Features:**
+- **Multi-line Input**: Support for complex, multi-line prompts with backslash continuation
+- **Context Progress**: Real-time display of token usage and context window status
+- **Error Recovery**: Automatic retry logic for timeout and recoverable errors
+- **History Management**: Persistent command history across sessions
+- **Theme Support**: Customizable color themes for retro mode
+- **Cancellation**: Ctrl+C support for graceful operation cancellation
+
+### 4. g3-execution: Code Execution Engine
+
+**Primary Responsibilities:**
+- Safe execution of shell commands and scripts
+- Streaming output capture and display
+- Multi-language code execution support
+- Error handling and result formatting
+
+**Supported Execution:**
+- **Bash/Shell**: Direct command execution with streaming output (primary use case)
+- **Python**: Script execution via temporary files (legacy support)
+- **JavaScript**: Node.js-based execution (legacy support)
+
+**Key Features:**
+- **Streaming Output**: Real-time command output display
+- **Error Capture**: Comprehensive stderr and stdout handling
+- **Exit Code Tracking**: Proper success/failure detection
+- **Async Execution**: Non-blocking command execution
+- **Output Formatting**: Clean, user-friendly result presentation
+
+### 5. g3-config: Configuration Management
+
+**Primary Responsibilities:**
+- TOML-based configuration file management
+- Environment variable overrides
+- Provider-specific settings and credentials
+- CLI argument integration
+
+**Configuration Hierarchy:**
+1. Default configuration (Databricks provider with OAuth)
+2. Configuration files (`~/.config/g3/config.toml`, `./g3.toml`)
+3. Environment variables (`G3_*`)
+4. CLI arguments (highest priority)
+
+**Key Features:**
+- **Auto-generation**: Creates default configuration files if none exist
+- **Provider Overrides**: Runtime provider and model selection
+- **Validation**: Configuration validation with helpful error messages
+- **Flexible Paths**: Support for shell expansion (`~`, environment variables)
+
+### 6. g3-computer-control: Computer Control & Automation
+
+**Primary Responsibilities:**
+- Cross-platform computer control and automation
+- Mouse and keyboard input simulation
+- Window management and screenshot capture
+- OCR text extraction from images and screen regions
+
+**Platform Support:**
+- **macOS**: Core Graphics, Cocoa, screencapture integration
+- **Linux**: X11/Xtest for input, X11 for window management
+- **Windows**: Win32 APIs for input and window control
+
+**Key Features:**
+- **OCR Integration**: Tesseract-based text extraction from images
+- **Window Management**: List, identify, and capture specific application windows
+- **UI Automation**: Find elements, simulate clicks, type text
+- **Screenshot Capture**: Full screen, regions, or specific windows
+- **Accessibility**: Requires OS-level permissions for automation
+
+## Advanced Features
+
+### Context Window Management
+
+G3 implements sophisticated context window management:
+
+- **Automatic Monitoring**: Tracks token usage with percentage-based thresholds
+- **Smart Summarization**: Auto-triggers at 80% capacity to prevent context overflow
+- **Context Thinning**: Progressive thinning at 50%, 60%, 70%, 80% thresholds - replaces large tool results with file references
+- **Conversation Preservation**: Maintains conversation continuity through intelligent summaries
+- **Provider-Specific Limits**: Adapts to different model context windows (4k to 200k+ tokens)
+- **Cumulative Tracking**: Monitors total token usage across entire sessions
+
+### Error Handling & Resilience
+
+Comprehensive error handling system:
+
+- **Error Classification**: Distinguishes between recoverable and non-recoverable errors
+- **Automatic Retry**: Exponential backoff with jitter for rate limits, timeouts, and server errors
+- **Detailed Logging**: Comprehensive error context including stack traces and session data
+- **Error Persistence**: Saves detailed error logs to `logs/errors/` for analysis
+- **Graceful Degradation**: Continues operation when possible, fails gracefully when not
+
+### Session Management
+
+Automatic session tracking and logging:
+
+- **Session IDs**: Generated based on initial prompts for easy identification
+- **Complete Logs**: Full conversation history, token usage, and timing data
+- **JSON Format**: Structured logs for easy parsing and analysis
+- **Automatic Cleanup**: Organized in `logs/` directory with timestamps
+- **Status Tracking**: Records session completion status (completed, cancelled, error)
+
+### Autonomous Mode
+
+Advanced autonomous operation with coach-player feedback:
+
+- **Requirements-Driven**: Reads `requirements.md` for project specifications
+- **Dual-Agent System**: Separate player (implementation) and coach (review) agents
+- **Iterative Improvement**: Multiple rounds of implementation and feedback
+- **Progress Tracking**: Detailed reporting of turns, token usage, and final status
+- **Workspace Management**: Automatic workspace setup and file organization

 ## Provider Comparison

-| Feature | OpenAI | Anthropic | Embedded |
-|---------|--------|-----------|----------|
+| Feature | Anthropic | Databricks (Default) | Embedded |
+|---------|-----------|------------|----------|
 | **Cost** | Pay per token | Pay per token | Free after download |
 | **Privacy** | Data sent to API | Data sent to API | Completely local |
 | **Performance** | Very fast | Very fast | Depends on hardware |
 | **Model Quality** | Excellent | Excellent | Good (varies by model) |
 | **Offline Support** | No | No | Yes |
-| **Setup Complexity** | API key only | API key only | Model download required |
+| **Setup Complexity** | API key only | OAuth or token | Model download required |
+| **Context Window** | 200k tokens | Varies by model | 4k-32k tokens |
+| **Tool Calling** | Native support | Native support | JSON fallback |
 | **Hardware Requirements** | None | None | 4-16GB RAM, optional GPU |

 ## Configuration Examples

-### Cloud-First Setup
+### Cloud-First Setup (Anthropic)
 ```toml
 [providers]
-default_provider = "openai"
+default_provider = "anthropic"

-[providers.openai]
-api_key = "sk-..."
-model = "gpt-4"
+[providers.anthropic]
+api_key = "sk-ant-..."
+model = "claude-3-5-sonnet-20241022"
+max_tokens = 8192
+temperature = 0.1
 ```

-### Privacy-First Setup  
+### Enterprise Setup (Databricks - Default)
+```toml
+[providers]
+default_provider = "databricks"
+
+[providers.databricks]
+host = "https://your-workspace.cloud.databricks.com"
+model = "databricks-claude-sonnet-4"
+max_tokens = 32000
+temperature = 0.1
+use_oauth = true
+```
+
+### Privacy-First Setup (Local Models)
 ```toml
 [providers]
 default_provider = "embedded"

 [providers.embedded]
-model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
-model_type = "codellama"
+model_path = "~/.cache/g3/models/qwen2.5-7b-instruct-q3_k_m.gguf"
+model_type = "qwen"
+context_length = 32768
+max_tokens = 2048
+temperature = 0.1
 gpu_layers = 32
+threads = 8
 ```

 ### Hybrid Setup
@@ -159,14 +318,109 @@ gpu_layers = 32
 [providers]
 default_provider = "embedded"

-# Use embedded for most tasks
+# Local model for most tasks
 [providers.embedded]
 model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
 model_type = "codellama"
+context_length = 16384
 gpu_layers = 32

-# Fallback to cloud for complex tasks
-[providers.openai]
-api_key = "sk-..."
-model = "gpt-4"
+# Cloud fallback for complex tasks
+[providers.anthropic]
+api_key = "sk-ant-..."
+model = "claude-3-5-sonnet-20241022"
 ```
+
+## Usage Examples
+
+### Single-Shot Mode
+```bash
+g3 "implement a fibonacci function in Rust"
+```
+
+### Interactive Mode
+```bash
+g3
+g3> read the README and suggest improvements
+g3> implement the suggestions you made
+```
+
+### Autonomous Mode
+```bash
+g3 --autonomous --max-turns 10
+# Reads requirements.md and implements iteratively
+```
+
+### Retro TUI Mode
+```bash
+g3 --retro --theme dracula
+# Full-screen terminal interface
+```
+
+## Implementation Details
+
+### Planned Features
+- **Plugin System**: Custom tool and provider plugins
+- **Web Interface**: Browser-based UI for remote access
+- **Model Quantization**: Optimized local model deployment
+- **Multi-Model Ensemble**: Combine multiple models for better results
+- **Advanced Sandboxing**: Enhanced security for code execution
+- **Collaborative Mode**: Multi-user sessions and shared workspaces
+
+### Technical Improvements
+- **Performance Optimization**: Faster streaming and tool execution
+- **Memory Management**: Better handling of large contexts and files
+- **Caching System**: Intelligent caching of model responses and computations
+- **Monitoring**: Built-in metrics and performance monitoring
+- **Testing**: Comprehensive test suite and CI/CD integration
+
+## Development Guidelines
+
+### Code Organization
+- **Modular Design**: Each crate has a single, well-defined responsibility
+- **Trait-Based**: Use traits for abstraction and testability
+- **Error Handling**: Comprehensive error types with context
+- **Documentation**: Inline docs and examples for all public APIs
+- **Testing**: Unit tests, integration tests, and property-based testing
+
+### Performance Considerations
+- **Async-First**: All I/O operations are asynchronous (Tokio runtime)
+- **Streaming**: Real-time response processing where possible
+- **Memory Efficiency**: Careful memory management for large contexts
+- **Caching**: Strategic caching of expensive operations
+- **Profiling**: Regular performance profiling and optimization
+
+This design document reflects the current state of G3 as a mature, production-ready AI coding agent with sophisticated architecture and comprehensive feature set.
+
+## Current Implementation Status
+
+### Fully Implemented
+- ✅ **Core Agent Engine**: Complete with streaming, tool execution, and context management
+- ✅ **Provider System**: Anthropic, Databricks, and Embedded providers with OAuth support
+- ✅ **Tool System**: 13 tools including file ops, shell, TODO management, and computer control
+- ✅ **CLI Interface**: Interactive mode, single-shot mode, retro TUI
+- ✅ **Autonomous Mode**: Coach-player feedback loop with requirements.md processing
+- ✅ **Configuration**: TOML-based config with environment overrides
+- ✅ **Error Handling**: Comprehensive retry logic and error classification
+- ✅ **Session Logging**: Automatic session tracking and JSON logs
+- ✅ **Context Management**: Context thinning (50-80%) and auto-summarization at 80% capacity
+- ✅ **Computer Control**: Cross-platform automation with OCR support
+- ✅ **TODO Management**: In-memory TODO list with read/write tools
+
+### Architecture Highlights
+- **Workspace**: 6 crates with clear separation of concerns
+- **Dependencies**: Modern Rust ecosystem (Tokio, Clap, Serde, etc.)
+- **Streaming**: Real-time response processing with tool call detection
+- **Cross-Platform**: Works on macOS, Linux, and Windows
+- **GPU Support**: Metal acceleration for local models on macOS, CUDA on Linux
+- **OCR Support**: Tesseract integration for text extraction from images
+
+### Key Files
+- `src/main.rs`: main entry point delegating to g3-cli
+- `crates/g3-core/src/lib.rs`: main agent implementation
+- `crates/g3-cli/src/lib.rs`: CLI and interaction modes
+- `crates/g3-providers/src/lib.rs`: provider trait and registry
+- `crates/g3-config/src/lib.rs`: configuration management
+- `crates/g3-execution/src/lib.rs`: code execution engine
+- `crates/g3-computer-control/src/lib.rs`: computer control and automation
+- `crates/g3-computer-control/src/platform/`: platform-specific implementations
--- a/README.md
+++ b/README.md
@@ -1,3 +1,255 @@
-# G3
+# G3 - AI Coding Agent

-An experiment in a code-first AI agent that helps you complete tasks by writing and executing code.
+G3 is a coding AI agent designed to help you complete tasks by writing code and executing commands. Built in Rust, it provides a flexible architecture for interacting with various Large Language Model (LLM) providers while offering powerful code generation and task automation capabilities.
+
+## Key Features
+
+- **Multiple LLM Providers**: Anthropic (Claude), Databricks, OpenAI, and local models via llama.cpp
+- **Autonomous Mode**: Coach-player feedback loop for complex tasks
+- **Intelligent Context Management**: Auto-summarization and context thinning at 50-80% thresholds
+- **Rich Tool Ecosystem**: File operations, shell commands, computer control, browser automation
+- **Streaming Responses**: Real-time output with tool call detection
+- **Error Recovery**: Automatic retry logic with exponential backoff
+
+## Getting Started
+
+```bash
+# Build the project
+cargo build --release
+
+# Execute a single task
+g3 "implement a function to calculate fibonacci numbers"
+
+# Start autonomous mode with interactive requirements
+g3 --autonomous --interactive-requirements
+```
+
+## Configuration
+
+Create `~/.config/g3/config.toml`:
+
+```toml
+[providers]
+default_provider = "databricks"
+
+[providers.anthropic]
+api_key = "sk-ant-..."
+model = "claude-3-5-sonnet-20241022"
+max_tokens = 4096
+
+[providers.databricks]
+host = "https://your-workspace.cloud.databricks.com"
+model = "databricks-meta-llama-3-1-70b-instruct"
+max_tokens = 4096
+use_oauth = true
+
+[agent]
+max_context_length = 8192
+enable_streaming = true
+
+# Optional: Use different models for coach and player in autonomous mode
+[autonomous]
+coach_provider = "anthropic"
+coach_model = "claude-3-5-sonnet-20241022"  # Thorough review
+player_provider = "databricks"
+player_model = "databricks-meta-llama-3-1-70b-instruct"  # Fast execution
+```
+
+## Autonomous Mode (Coach-Player Loop)
+
+G3 features an autonomous mode where two agents collaborate:
+- **Player Agent**: Executes tasks and implements solutions
+- **Coach Agent**: Reviews work and provides feedback
+
+### Option 1: Interactive Requirements with AI Enhancement (Recommended)
+
+```bash
+g3 --autonomous --interactive-requirements
+```
+
+**How it works:**
+1. Describe what you want to build (can be brief)
+2. Press **Ctrl+D** (Unix/Mac) or **Ctrl+Z** (Windows)
+3. AI enhances your input into a structured requirements document
+4. Review the enhanced requirements
+5. Choose to proceed, edit manually, or cancel
+6. If accepted, autonomous mode starts automatically
+
+**Example:**
+```
+You type: "build a todo app with cli in python"
+
+AI generates:
+# Todo List CLI Application
+
+## Overview
+A command-line todo list application built in Python...
+
+## Functional Requirements
+1. Add tasks with descriptions
+2. Mark tasks as complete
+3. Delete tasks
+...
+```
+
+### Option 2: Direct Requirements
+
+```bash
+g3 --autonomous --requirements "Build a REST API with CRUD operations for user management"
+```
+
+### Option 3: Requirements File
+
+Create `requirements.md` in your workspace:
+
+```markdown
+# Project Requirements
+
+1. Create a REST API with user endpoints
+2. Use SQLite for storage
+3. Include input validation
+4. Write unit tests
+```
+
+Then run:
+
+```bash
+g3 --autonomous
+```
+
+### Why Different Models for Coach and Player?
+
+Configure different models in the `[autonomous]` section to:
+- **Optimize Cost**: Use cheaper model for execution, expensive for review
+- **Optimize Speed**: Use fast model for iteration, thorough for validation
+- **Specialize**: Leverage provider strengths (e.g., Claude for analysis, Llama for code)
+
+If not configured, both agents use the `default_provider` and its model.
+
+## Command-Line Options
+
+```bash
+# Autonomous mode
+g3 --autonomous --interactive-requirements
+g3 --autonomous --requirements "Your requirements"
+g3 --autonomous --max-turns 10
+
+# Single-shot mode
+g3 "your task here"
+
+# Options
+--workspace <DIR>          # Set workspace directory
+--provider <NAME>          # Override provider (anthropic, databricks, openai)
+--model <NAME>             # Override model
+--quiet                    # Disable log files
+--webdriver                # Enable browser automation
+--show-prompt              # Show system prompt
+--show-code                # Show generated code
+```
+
+## Architecture Overview
+
+G3 is organized as a Rust workspace with multiple crates:
+
+- **g3-core**: Agent engine, context management, tool system, streaming parser
+- **g3-providers**: LLM provider abstraction (Anthropic, Databricks, OpenAI, local models)
+- **g3-config**: Configuration management
+- **g3-execution**: Task execution framework
+- **g3-computer-control**: Mouse/keyboard automation, OCR, screenshots
+- **g3-cli**: Command-line interface
+
+### Key Capabilities
+
+**Intelligent Context Management**
+- Automatic context window monitoring with percentage-based tracking
+- Smart auto-summarization when approaching token limits
+- Context thinning at 50%, 60%, 70%, 80% thresholds
+- Dynamic token allocation (4k to 200k+ tokens)
+
+**Tool Ecosystem**
+- File operations (read, write, edit with line-range precision)
+- Shell command execution
+- TODO management
+- Computer control (experimental): mouse, keyboard, OCR, screenshots
+- Browser automation via WebDriver (Safari)
+
+**Error Handling**
+- Automatic retry logic with exponential backoff
+- Recoverable error detection (rate limits, network issues, timeouts)
+- Detailed error logging to `logs/errors/`
+
+## WebDriver Browser Automation
+
+**One-Time Setup** (macOS):
+
+```bash
+# Enable Safari Remote Automation
+safaridriver --enable  # Requires password
+
+# Or via Safari UI:
+# Safari → Preferences → Advanced → Show Develop menu
+# Then: Develop → Allow Remote Automation
+```
+
+**Usage**:
+
+```bash
+g3 --webdriver "scrape the top stories from Hacker News"
+```
+
+See [docs/webdriver-setup.md](docs/webdriver-setup.md) for detailed setup.
+
+## Computer Control (Experimental)
+
+Enable in config:
+
+```toml
+[computer_control]
+enabled = true
+require_confirmation = true
+```
+
+Grant accessibility permissions:
+- **macOS**: System Preferences → Security & Privacy → Accessibility
+- **Linux**: Ensure X11 or Wayland access
+- **Windows**: Run as administrator (first time)
+
+**Available Tools**: `mouse_click`, `type_text`, `find_element`, `take_screenshot`, `extract_text`, `find_text_on_screen`, `list_windows`
+
+## Use Cases
+
+- Automated code generation and refactoring
+- File manipulation and project scaffolding
+- System administration tasks
+- Data processing and transformation
+- API integration and testing
+- Documentation generation
+- Complex multi-step workflows
+- Desktop application automation
+
+## Session Logs
+
+G3 automatically saves session logs to `logs/` directory:
+- Complete conversation history
+- Token usage statistics
+- Timestamps and session status
+
+Disable with `--quiet` flag.
+
+## Technology Stack
+
+- **Language**: Rust (2021 edition)
+- **Async Runtime**: Tokio
+- **HTTP Client**: Reqwest
+- **Serialization**: Serde
+- **CLI Framework**: Clap
+- **Logging**: Tracing
+- **Local Models**: llama.cpp with Metal acceleration
+
+## License
+
+MIT License - see LICENSE file for details
+
+## Contributing
+
+Contributions welcome! Please see CONTRIBUTING.md for guidelines.
--- a/19
+++ b/19
@@ -0,0 +1,19 @@
+next tasks
+
+x get something working with autonomous mode
+- g3d
+- bug where it prints everything in a conversation turn all over again before final_output
+x ui abstraction from core
+- context token counting bug
+- embedded model
+  - prompt rewriting
+  - generates status messages "ruffling feathers..."
+  - project description?
+- treesitter + friends
+x error where it just gives up turn
+- "project" behaviors (read readme first)
+- advance project mgmt
+  - git for reverting
+  - swarm
+  - ui tests / computer controller
+
--- a/config.example.toml
+++ b/config.example.toml
@@ -1,38 +1,20 @@
-# Example configuration file for G3
-# Copy to ~/.config/g3/config.toml and customize
-
 [providers]
-default_provider = "embedded"
+default_provider = "databricks"

-[providers.openai]
-# Get your API key from https://platform.openai.com/api-keys
-api_key = "sk-your-openai-api-key-here"
-model = "gpt-4"
-# Optional: custom base URL for OpenAI-compatible APIs
-# base_url = "https://api.openai.com/v1"
-max_tokens = 2048
-temperature = 0.1
-
-[providers.anthropic]
-# Get your API key from https://console.anthropic.com/
-api_key = "your-anthropic-api-key-here"
-model = "claude-3-5-sonnet-20241022"
+[providers.databricks]
+host = "https://your-workspace.cloud.databricks.com"
+# token = "your-databricks-token"  # Optional - will use OAuth if not provided
+model = "databricks-claude-sonnet-4"
 max_tokens = 4096
 temperature = 0.1
-
-[providers.embedded]
-# Path to your GGUF model file
-model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
-model_type = "codellama"
-context_length = 16384  # Use CodeLlama's full context capability
-max_tokens = 2048  # Default fallback, but will be calculated dynamically
-temperature = 0.1
-# Number of layers to offload to GPU (0 for CPU only)
-gpu_layers = 32
-# Number of CPU threads to use
-threads = 8
+use_oauth = true

 [agent]
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
+
+[computer_control]
+enabled = false  # Set to true to enable computer control (requires OS permissions)
+require_confirmation = true
+max_actions_per_second = 5
--- a/crates/g3-cli/Cargo.toml
+++ b/crates/g3-cli/Cargo.toml
@@ -12,9 +12,13 @@ tokio = { workspace = true }
 anyhow = { workspace = true }
 tracing = { workspace = true }
 tracing-subscriber = { workspace = true, features = ["env-filter"] }
-serde = { workspace = true }
+serde = { workspace = true, features = ["derive"] }
 serde_json = { workspace = true }
 rustyline = "17.0.1"
 dirs = "5.0"
 tokio-util = "0.7"
 indicatif = "0.17"
+chrono = { version = "0.4", features = ["serde"] }
+crossterm = "0.29.0"
+ratatui = "0.29"
+termimad = "0.34.0"
--- a/crates/g3-cli/src/lib.rs
+++ b/crates/g3-cli/src/lib.rs
--- a/crates/g3-cli/src/retro_tui.rs
+++ b/crates/g3-cli/src/retro_tui.rs
--- a/crates/g3-cli/src/theme.rs
+++ b/crates/g3-cli/src/theme.rs
@@ -0,0 +1,147 @@
+use ratatui::style::Color;
+use serde::{Deserialize, Serialize};
+use std::fs;
+use std::path::Path;
+use anyhow::Result;
+
+/// Color theme configuration for the retro TUI
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ColorTheme {
+    /// Name of the theme
+    pub name: String,
+    
+    /// Main terminal text color (for general output)
+    pub terminal_green: ColorValue,
+    
+    /// Warning/system messages color
+    pub terminal_amber: ColorValue,
+    
+    /// Border and dim text color
+    pub terminal_dim_green: ColorValue,
+    
+    /// Background color
+    pub terminal_bg: ColorValue,
+    
+    /// Highlight/emphasis color
+    pub terminal_cyan: ColorValue,
+    
+    /// Error/negative diff color
+    pub terminal_red: ColorValue,
+    
+    /// READY status color
+    pub terminal_pale_blue: ColorValue,
+    
+    /// PROCESSING status color
+    pub terminal_dark_amber: ColorValue,
+    
+    /// Bright/punchy text color
+    pub terminal_white: ColorValue,
+    
+    /// Success status color (for tool completions)
+    pub terminal_success: ColorValue,
+}
+
+/// Represents a color value that can be serialized/deserialized
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(untagged)]
+pub enum ColorValue {
+    /// RGB color with r, g, b components
+    Rgb { r: u8, g: u8, b: u8 },
+    /// Named color
+    Named(String),
+}
+
+impl ColorValue {
+    /// Convert to ratatui Color
+    pub fn to_color(&self) -> Color {
+        match self {
+            ColorValue::Rgb { r, g, b } => Color::Rgb(*r, *g, *b),
+            ColorValue::Named(name) => match name.to_lowercase().as_str() {
+                "black" => Color::Black,
+                "red" => Color::Red,
+                "green" => Color::Green,
+                "yellow" => Color::Yellow,
+                "blue" => Color::Blue,
+                "magenta" => Color::Magenta,
+                "cyan" => Color::Cyan,
+                "gray" | "grey" => Color::Gray,
+                "darkgray" | "darkgrey" => Color::DarkGray,
+                "lightred" => Color::LightRed,
+                "lightgreen" => Color::LightGreen,
+                "lightyellow" => Color::LightYellow,
+                "lightblue" => Color::LightBlue,
+                "lightmagenta" => Color::LightMagenta,
+                "lightcyan" => Color::LightCyan,
+                "white" => Color::White,
+                _ => Color::White, // Default fallback
+            },
+        }
+    }
+}
+
+impl ColorTheme {
+    /// Load a theme from a JSON file
+    pub fn from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
+        let content = fs::read_to_string(path)?;
+        let theme: ColorTheme = serde_json::from_str(&content)?;
+        Ok(theme)
+    }
+    
+    /// Get the default retro sci-fi theme (inspired by Alien terminals)
+    pub fn default() -> Self {
+        ColorTheme {
+            name: "Retro Sci-Fi".to_string(),
+            terminal_green: ColorValue::Rgb { r: 136, g: 244, b: 152 },
+            terminal_amber: ColorValue::Rgb { r: 242, g: 204, b: 148 },
+            terminal_dim_green: ColorValue::Rgb { r: 154, g: 174, b: 135 },
+            terminal_bg: ColorValue::Rgb { r: 0, g: 10, b: 0 },
+            terminal_cyan: ColorValue::Rgb { r: 0, g: 255, b: 255 },
+            terminal_red: ColorValue::Rgb { r: 239, g: 119, b: 109 },
+            terminal_pale_blue: ColorValue::Rgb { r: 173, g: 234, b: 251 },
+            terminal_dark_amber: ColorValue::Rgb { r: 204, g: 119, b: 34 },
+            terminal_white: ColorValue::Rgb { r: 218, g: 218, b: 219 },
+            terminal_success: ColorValue::Rgb { r: 136, g: 244, b: 152 }, // Same as terminal_green for retro theme
+        }
+    }
+    
+    /// Get the Dracula theme
+    pub fn dracula() -> Self {
+        ColorTheme {
+            name: "Dracula".to_string(),
+            terminal_green: ColorValue::Rgb { r: 248, g: 248, b: 242 }, // Use Dracula foreground (white) for main text
+            terminal_amber: ColorValue::Rgb { r: 255, g: 184, b: 108 }, // Dracula orange
+            terminal_dim_green: ColorValue::Rgb { r: 98, g: 114, b: 164 }, // Dracula comment
+            terminal_bg: ColorValue::Rgb { r: 40, g: 42, b: 54 },      // Dracula background
+            terminal_cyan: ColorValue::Rgb { r: 139, g: 233, b: 253 },  // Dracula cyan
+            terminal_red: ColorValue::Rgb { r: 255, g: 85, b: 85 },     // Dracula red
+            terminal_pale_blue: ColorValue::Rgb { r: 189, g: 147, b: 249 }, // Dracula purple
+            terminal_dark_amber: ColorValue::Rgb { r: 255, g: 121, b: 198 }, // Dracula pink
+            terminal_white: ColorValue::Rgb { r: 248, g: 248, b: 242 }, // Dracula foreground
+            terminal_success: ColorValue::Rgb { r: 80, g: 250, b: 123 }, // Dracula green for success
+        }
+    }
+    
+    /// Get a theme by name or from file
+    pub fn load(theme_name: Option<&str>) -> Result<Self> {
+        match theme_name {
+            None => Ok(Self::default()),
+            Some("default") | Some("retro") => Ok(Self::default()),
+            Some("dracula") => Ok(Self::dracula()),
+            Some(path) => {
+                // Try to load from file
+                if Path::new(path).exists() {
+                    Self::from_file(path)
+                } else {
+                    // Try to find in standard locations
+                    let home = dirs::home_dir().ok_or_else(|| anyhow::anyhow!("Could not find home directory"))?;
+                    let theme_file = home.join(".config").join("g3").join("themes").join(format!("{}.json", path));
+                    if theme_file.exists() {
+                        Self::from_file(theme_file)
+                    } else {
+                        Err(anyhow::anyhow!("Theme '{}' not found", path))
+                    }
+                }
+            }
+        }
+    }
+}
--- a/crates/g3-cli/src/tui.rs
+++ b/crates/g3-cli/src/tui.rs
@@ -0,0 +1,126 @@
+use crossterm::style::Color;
+use crossterm::style::{SetForegroundColor, ResetColor};
+use termimad::MadSkin;
+
+/// Simple output handler with markdown support
+pub struct SimpleOutput {
+    mad_skin: MadSkin,
+}
+
+impl SimpleOutput {
+    pub fn new() -> Self {
+        let mut mad_skin = MadSkin::default();
+        // Dracula color scheme
+        // Background: #282a36, Foreground: #f8f8f2
+        // Colors: Cyan #8be9fd, Green #50fa7b, Orange #ffb86c, Pink #ff79c6, Purple #bd93f9, Red #ff5555, Yellow #f1fa8c
+        
+        mad_skin.set_headers_fg(Color::Rgb { r: 189, g: 147, b: 249 }); // Purple for headers
+        mad_skin.bold.set_fg(Color::Rgb { r: 255, g: 121, b: 198 });    // Pink for bold
+        mad_skin.italic.set_fg(Color::Rgb { r: 139, g: 233, b: 253 });  // Cyan for italic
+        mad_skin.code_block.set_bg(Color::Rgb { r: 68, g: 71, b: 90 }); // Dracula background variant
+        mad_skin.code_block.set_fg(Color::Rgb { r: 80, g: 250, b: 123 }); // Green for code text
+        mad_skin.inline_code.set_bg(Color::Rgb { r: 68, g: 71, b: 90 }); // Same background for inline code
+        mad_skin.inline_code.set_fg(Color::Rgb { r: 241, g: 250, b: 140 }); // Yellow for inline code
+        mad_skin.quote_mark.set_fg(Color::Rgb { r: 98, g: 114, b: 164 }); // Comment purple for quote marks
+        mad_skin.strikeout.set_fg(Color::Rgb { r: 255, g: 85, b: 85 });  // Red for strikethrough
+        
+        Self { mad_skin }
+    }
+
+    /// Detect if text contains markdown formatting
+    fn has_markdown(&self, text: &str) -> bool {
+        // Check for common markdown patterns
+        text.contains("**") ||
+        text.contains("```") ||
+        text.contains("`") ||
+        text.lines().any(|line| {
+            let trimmed = line.trim();
+            trimmed.starts_with('#') ||
+            trimmed.starts_with("- ") ||
+            trimmed.starts_with("* ") ||
+            trimmed.starts_with("+ ") ||
+            (trimmed.len() > 2 && 
+             trimmed.chars().next().map_or(false, |c| c.is_ascii_digit()) &&
+             trimmed.chars().nth(1) == Some('.') &&
+             trimmed.chars().nth(2) == Some(' ')) ||
+            (trimmed.contains('[') && trimmed.contains("]("))
+        }) ||
+        (text.matches('*').count() >= 2 && !text.contains("/*") && !text.contains("*/"))
+    }
+
+    pub fn print(&self, text: &str) {
+        println!("{}", text);
+    }
+
+    /// Smart print that automatically detects and renders markdown
+    pub fn print_smart(&self, text: &str) {
+        if self.has_markdown(text) {
+            self.print_markdown(text);
+        } else {
+            self.print(text);
+        }
+    }
+
+    pub fn print_markdown(&self, markdown: &str) {
+        self.mad_skin.print_text(markdown);
+    }
+
+    pub fn _print_status(&self, status: &str) {
+        println!("📊 {}", status);
+    }
+
+    pub fn print_context(&self, used: u32, total: u32, percentage: f32) {
+        let bar_width: usize = 10;
+        let filled_width = ((percentage / 100.0) * bar_width as f32) as usize;
+        let empty_width = bar_width.saturating_sub(filled_width);
+
+        let filled_chars = "●".repeat(filled_width);
+        let empty_chars = "○".repeat(empty_width);
+        
+        // Determine color based on percentage
+        let color = if percentage < 60.0 {
+            crossterm::style::Color::Green
+        } else if percentage < 80.0 {
+            crossterm::style::Color::Yellow
+        } else {
+            crossterm::style::Color::Red
+        };
+
+        // Print with colored progress bar
+        print!("Context: ");
+        print!("{}", SetForegroundColor(color));
+        print!("{}{}", filled_chars, empty_chars);
+        print!("{}", ResetColor);
+        println!(" {:.1}% | {}/{} tokens", percentage, used, total);
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_markdown_detection() {
+        let output = SimpleOutput::new();
+        
+        // Should detect markdown
+        assert!(output.has_markdown("**bold text**"));
+        assert!(output.has_markdown("`code`"));
+        assert!(output.has_markdown("```\ncode block\n```"));
+        assert!(output.has_markdown("# Header"));
+        assert!(output.has_markdown("- list item"));
+        assert!(output.has_markdown("* list item"));
+        assert!(output.has_markdown("+ list item"));
+        assert!(output.has_markdown("1. numbered item"));
+        assert!(output.has_markdown("[link](url)"));
+        assert!(output.has_markdown("*italic* text"));
+        
+        // Should NOT detect markdown
+        assert!(!output.has_markdown("plain text"));
+        assert!(!output.has_markdown("file.txt"));
+        assert!(!output.has_markdown("/* comment */"));
+        assert!(!output.has_markdown("just one * asterisk"));
+        assert!(!output.has_markdown("📁 Workspace: /path/to/dir"));
+        assert!(!output.has_markdown("✅ Success message"));
+    }
+}
--- a/crates/g3-cli/src/ui_writer_impl.rs
+++ b/crates/g3-cli/src/ui_writer_impl.rs
@@ -0,0 +1,534 @@
+use crate::retro_tui::RetroTui;
+use g3_core::ui_writer::UiWriter;
+use std::io::{self, Write};
+use std::sync::Mutex;
+use std::time::Instant;
+
+/// Console implementation of UiWriter that prints to stdout
+pub struct ConsoleUiWriter {
+    current_tool_name: Mutex<Option<String>>,
+    current_tool_args: Mutex<Vec<(String, String)>>,
+    current_output_line: Mutex<Option<String>>,
+    output_line_printed: Mutex<bool>,
+    in_todo_tool: Mutex<bool>,
+}
+
+impl ConsoleUiWriter {
+    pub fn new() -> Self {
+        Self {
+            current_tool_name: Mutex::new(None),
+            current_tool_args: Mutex::new(Vec::new()),
+            current_output_line: Mutex::new(None),
+            output_line_printed: Mutex::new(false),
+            in_todo_tool: Mutex::new(false),
+        }
+    }
+
+    fn print_todo_line(&self, line: &str) {
+        // Transform and print todo list lines elegantly
+        let trimmed = line.trim();
+        
+        // Skip the "📝 TODO list:" prefix line
+        if trimmed.starts_with("📝 TODO list:") || trimmed == "📝 TODO list is empty" {
+            return;
+        }
+        
+        // Handle empty lines
+        if trimmed.is_empty() {
+            println!();
+            return;
+        }
+        
+        // Detect indentation level
+        let indent_count = line.chars().take_while(|c| c.is_whitespace()).count();
+        let indent = "  ".repeat(indent_count / 2); // Convert spaces to visual indent
+        
+        // Format based on line type
+        if trimmed.starts_with("- [ ]") {
+            // Incomplete task
+            let task = trimmed.strip_prefix("- [ ]").unwrap_or(trimmed).trim();
+            println!("{}☐ {}", indent, task);
+        } else if trimmed.starts_with("- [x]") || trimmed.starts_with("- [X]") {
+            // Completed task
+            let task = trimmed.strip_prefix("- [x]")
+                .or_else(|| trimmed.strip_prefix("- [X]"))
+                .unwrap_or(trimmed)
+                .trim();
+            println!("{}\x1b[2m☑ {}\x1b[0m", indent, task);
+        } else if trimmed.starts_with("- ") {
+            // Regular bullet point
+            let item = trimmed.strip_prefix("- ").unwrap_or(trimmed).trim();
+            println!("{}• {}", indent, item);
+        } else if trimmed.starts_with("# ") {
+            // Heading
+            let heading = trimmed.strip_prefix("# ").unwrap_or(trimmed).trim();
+            println!("\n\x1b[1m{}\x1b[0m", heading);
+        } else if trimmed.starts_with("## ") {
+            // Subheading
+            let subheading = trimmed.strip_prefix("## ").unwrap_or(trimmed).trim();
+            println!("\n\x1b[1m{}\x1b[0m", subheading);
+        } else if trimmed.starts_with("**") && trimmed.ends_with("**") {
+            // Bold text (section marker)
+            let text = trimmed.trim_start_matches("**").trim_end_matches("**");
+            println!("{}\x1b[1m{}\x1b[0m", indent, text);
+        } else {
+            // Regular text or note
+            println!("{}{}", indent, trimmed);
+        }
+    }
+}
+
+impl UiWriter for ConsoleUiWriter {
+    fn print(&self, message: &str) {
+        print!("{}", message);
+    }
+
+    fn println(&self, message: &str) {
+        println!("{}", message);
+    }
+
+    fn print_inline(&self, message: &str) {
+        print!("{}", message);
+        let _ = io::stdout().flush();
+    }
+
+    fn print_system_prompt(&self, prompt: &str) {
+        println!("🔍 System Prompt:");
+        println!("================");
+        println!("{}", prompt);
+        println!("================");
+        println!();
+    }
+
+    fn print_context_status(&self, message: &str) {
+        println!("{}", message);
+    }
+
+    fn print_tool_header(&self, tool_name: &str) {
+        // Store the tool name and clear args for collection
+        *self.current_tool_name.lock().unwrap() = Some(tool_name.to_string());
+        self.current_tool_args.lock().unwrap().clear();
+        
+        // Check if this is a todo tool call
+        let is_todo = tool_name == "todo_read" || tool_name == "todo_write";
+        *self.in_todo_tool.lock().unwrap() = is_todo;
+        
+        // For todo tools, we'll skip the normal header and print a custom one later
+        if is_todo {
+            return;
+        }
+    }
+
+    fn print_tool_arg(&self, key: &str, value: &str) {
+        // Collect arguments instead of printing immediately
+        // Filter out any keys that look like they might be agent message content
+        // (e.g., keys that are suspiciously long or contain message-like content)
+        let is_valid_arg_key = key.len() < 50
+            && !key.contains('\n')
+            && !key.contains("I'll")
+            && !key.contains("Let me")
+            && !key.contains("Here's")
+            && !key.contains("I can");
+
+        if is_valid_arg_key {
+            self.current_tool_args
+                .lock()
+                .unwrap()
+                .push((key.to_string(), value.to_string()));
+        }
+    }
+
+    fn print_tool_output_header(&self) {
+        // Skip normal header for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            println!(); // Just add a newline
+            return;
+        }
+        
+        println!();
+        // Now print the tool header with the most important arg in bold green
+        if let Some(tool_name) = self.current_tool_name.lock().unwrap().as_ref() {
+            let args = self.current_tool_args.lock().unwrap();
+
+            // Find the most important argument - prioritize file_path if available
+            let important_arg = args
+                .iter()
+                .find(|(k, _)| k == "file_path")
+                .or_else(|| args.iter().find(|(k, _)| k == "command" || k == "path"))
+                .or_else(|| args.first());
+
+            if let Some((_, value)) = important_arg {
+                // For multi-line values, only show the first line
+                let first_line = value.lines().next().unwrap_or("");
+
+                // Truncate long values for display
+                let display_value = if first_line.len() > 80 {
+                    format!("{}...", &first_line[..77])
+                } else {
+                    first_line.to_string()
+                };
+
+                // Add range information for read_file tool calls
+                let header_suffix = if tool_name == "read_file" {
+                    // Check if start or end parameters are present
+                    let has_start = args.iter().any(|(k, _)| k == "start");
+                    let has_end = args.iter().any(|(k, _)| k == "end");
+                    
+                    if has_start || has_end {
+                        let start_val = args.iter().find(|(k, _)| k == "start").map(|(_, v)| v.as_str()).unwrap_or("0");
+                        let end_val = args.iter().find(|(k, _)| k == "end").map(|(_, v)| v.as_str()).unwrap_or("end");
+                        format!(" [{}..{}]", start_val, end_val)
+                    } else {
+                        String::new()
+                    }
+                } else {
+                    String::new()
+                };
+
+                // Print with bold green tool name, purple (non-bold) for pipe and args
+                println!("┌─\x1b[1;32m {}\x1b[0m\x1b[35m | {}{}\x1b[0m", tool_name, display_value, header_suffix);
+            } else {
+                // Print with bold green formatting using ANSI escape codes
+                println!("┌─\x1b[1;32m {}\x1b[0m", tool_name);
+            }
+        }
+    }
+
+    fn update_tool_output_line(&self, line: &str) {
+        let mut current_line = self.current_output_line.lock().unwrap();
+        let mut line_printed = self.output_line_printed.lock().unwrap();
+
+        // If we've already printed a line, clear it first
+        if *line_printed {
+            // Move cursor up one line and clear it
+            print!("\x1b[1A\x1b[2K");
+        }
+
+        // Print the new line
+        println!("│ \x1b[2m{}\x1b[0m", line);
+        let _ = io::stdout().flush();
+
+        // Update state
+        *current_line = Some(line.to_string());
+        *line_printed = true;
+    }
+
+    fn print_tool_output_line(&self, line: &str) {
+        // Special handling for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            self.print_todo_line(line);
+            return;
+        }
+        
+        println!("│ \x1b[2m{}\x1b[0m", line);
+    }
+
+    fn print_tool_output_summary(&self, count: usize) {
+        // Skip for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            return;
+        }
+        
+        println!(
+            "│ \x1b[2m({} line{})\x1b[0m",
+            count,
+            if count == 1 { "" } else { "s" }
+        );
+    }
+
+    fn print_tool_timing(&self, duration_str: &str) {
+        // For todo tools, just print a simple completion message
+        if *self.in_todo_tool.lock().unwrap() {
+            println!();
+            *self.in_todo_tool.lock().unwrap() = false;
+            return;
+        }
+        
+        // Parse the duration string to determine color
+        // Format is like "1.5s", "500ms", "2m 30.0s"
+        let color_code = if duration_str.ends_with("ms") {
+            // Milliseconds - use default color (< 1s)
+            ""
+        } else if duration_str.contains('m') {
+            // Contains minutes
+            // Extract minutes value
+            if let Some(m_pos) = duration_str.find('m') {
+                if let Ok(minutes) = duration_str[..m_pos].trim().parse::<u32>() {
+                    if minutes >= 5 {
+                        "\x1b[31m" // Red for >= 5 minutes
+                    } else {
+                        "\x1b[38;5;208m" // Orange for >= 1 minute but < 5 minutes
+                    }
+                } else {
+                    "" // Default color if parsing fails
+                }
+            } else {
+                "" // Default color if 'm' not found (shouldn't happen)
+            }
+        } else if duration_str.ends_with('s') {
+            // Seconds only
+            if let Some(s_value) = duration_str.strip_suffix('s') {
+                if let Ok(seconds) = s_value.trim().parse::<f64>() {
+                    if seconds >= 1.0 {
+                        "\x1b[33m" // Yellow for >= 1 second
+                    } else {
+                        "" // Default color for < 1 second
+                    }
+                } else {
+                    "" // Default color if parsing fails
+                }
+            } else {
+                "" // Default color
+            }
+        } else {
+            // Milliseconds or other format - use default color
+            ""
+        };
+
+        println!("└─ ⚡️ {}{}\x1b[0m", color_code, duration_str);
+        println!();
+        // Clear the stored tool info
+        *self.current_tool_name.lock().unwrap() = None;
+        self.current_tool_args.lock().unwrap().clear();
+        *self.current_output_line.lock().unwrap() = None;
+        *self.output_line_printed.lock().unwrap() = false;
+    }
+
+    fn print_agent_prompt(&self) {
+        let _ = io::stdout().flush();
+    }
+
+    fn print_agent_response(&self, content: &str) {
+        print!("{}", content);
+        let _ = io::stdout().flush();
+    }
+
+    fn notify_sse_received(&self) {
+        // No-op for console - we don't track SSEs in console mode
+    }
+
+    fn flush(&self) {
+        let _ = io::stdout().flush();
+    }
+}
+
+/// RetroTui implementation of UiWriter that sends output to the TUI
+pub struct RetroTuiWriter {
+    tui: RetroTui,
+    current_tool_name: Mutex<Option<String>>,
+    current_tool_output: Mutex<Vec<String>>,
+    current_tool_start: Mutex<Option<Instant>>,
+    current_tool_caption: Mutex<String>,
+}
+
+impl RetroTuiWriter {
+    pub fn new(tui: RetroTui) -> Self {
+        Self {
+            tui,
+            current_tool_name: Mutex::new(None),
+            current_tool_output: Mutex::new(Vec::new()),
+            current_tool_start: Mutex::new(None),
+            current_tool_caption: Mutex::new(String::new()),
+        }
+    }
+}
+
+impl UiWriter for RetroTuiWriter {
+    fn print(&self, message: &str) {
+        self.tui.output(message);
+    }
+
+    fn println(&self, message: &str) {
+        self.tui.output(message);
+    }
+
+    fn print_inline(&self, message: &str) {
+        // For inline printing, we'll just append to the output
+        self.tui.output(message);
+    }
+
+    fn print_system_prompt(&self, prompt: &str) {
+        self.tui.output("🔍 System Prompt:");
+        self.tui.output("================");
+        for line in prompt.lines() {
+            self.tui.output(line);
+        }
+        self.tui.output("================");
+        self.tui.output("");
+    }
+
+    fn print_context_status(&self, message: &str) {
+        self.tui.output(message);
+    }
+
+    fn print_tool_header(&self, tool_name: &str) {
+        // Start collecting tool output
+        *self.current_tool_start.lock().unwrap() = Some(Instant::now());
+        *self.current_tool_name.lock().unwrap() = Some(tool_name.to_string());
+        self.current_tool_output.lock().unwrap().clear();
+        self.current_tool_output
+            .lock()
+            .unwrap()
+            .push(format!("Tool: {}", tool_name));
+
+        // Initialize caption
+        *self.current_tool_caption.lock().unwrap() = String::new();
+    }
+
+    fn print_tool_arg(&self, key: &str, value: &str) {
+        // Filter out any keys that look like they might be agent message content
+        // (e.g., keys that are suspiciously long or contain message-like content)
+        let is_valid_arg_key = key.len() < 50
+            && !key.contains('\n')
+            && !key.contains("I'll")
+            && !key.contains("Let me")
+            && !key.contains("Here's")
+            && !key.contains("I can");
+
+        if is_valid_arg_key {
+            self.current_tool_output
+                .lock()
+                .unwrap()
+                .push(format!("{}: {}", key, value));
+        }
+
+        // Build caption from first argument (usually the most important one)
+        let mut caption = self.current_tool_caption.lock().unwrap();
+        if caption.is_empty() && (key == "file_path" || key == "command" || key == "path") {
+            // Truncate long values for the caption
+            let truncated = if value.len() > 50 {
+                format!("{}...", &value[..47])
+            } else {
+                value.to_string()
+            };
+            
+            // Add range information for read_file tool calls
+            let tool_name = self.current_tool_name.lock().unwrap();
+            let range_suffix = if tool_name.as_ref().map_or(false, |name| name == "read_file") {
+                // We need to check if start/end args will be provided - for now just check if this is a partial read
+                // This is a simplified approach since we're building the caption incrementally
+                String::new() // We'll handle this in print_tool_output_header instead
+            } else {
+                String::new()
+            };
+            
+            *caption = format!("{}{}", truncated, range_suffix);
+        }
+    }
+
+    fn print_tool_output_header(&self) {
+        // This is called right before tool execution starts
+        // Send the initial tool header to the TUI now
+        if let Some(tool_name) = self.current_tool_name.lock().unwrap().as_ref() {
+            let mut caption = self.current_tool_caption.lock().unwrap().clone();
+            
+            // Add range information for read_file tool calls
+            if tool_name == "read_file" {
+                // Check the tool output for start/end parameters
+                let output = self.current_tool_output.lock().unwrap();
+                let has_start = output.iter().any(|line| line.starts_with("start:"));
+                let has_end = output.iter().any(|line| line.starts_with("end:"));
+                
+                if has_start || has_end {
+                    let start_val = output.iter().find(|line| line.starts_with("start:")).map(|line| line.split(':').nth(1).unwrap_or("0").trim()).unwrap_or("0");
+                    let end_val = output.iter().find(|line| line.starts_with("end:")).map(|line| line.split(':').nth(1).unwrap_or("end").trim()).unwrap_or("end");
+                    caption = format!("{} [{}..{}]", caption, start_val, end_val);
+                }
+            }
+
+            // Send the tool output with initial header
+            self.tui.tool_output(tool_name, &caption, "");
+        }
+
+        self.current_tool_output.lock().unwrap().push(String::new());
+        self.current_tool_output
+            .lock()
+            .unwrap()
+            .push("Output:".to_string());
+    }
+
+    fn update_tool_output_line(&self, line: &str) {
+        // For retro mode, we'll just add to the output buffer
+        self.current_tool_output
+            .lock()
+            .unwrap()
+            .push(line.to_string());
+    }
+
+    fn print_tool_output_line(&self, line: &str) {
+        self.current_tool_output
+            .lock()
+            .unwrap()
+            .push(line.to_string());
+    }
+
+    fn print_tool_output_summary(&self, hidden_count: usize) {
+        self.current_tool_output.lock().unwrap().push(format!(
+            "... ({} more line{})",
+            hidden_count,
+            if hidden_count == 1 { "" } else { "s" }
+        ));
+    }
+
+    fn print_tool_timing(&self, duration_str: &str) {
+        self.current_tool_output
+            .lock()
+            .unwrap()
+            .push(format!("⚡️ {}", duration_str));
+
+        // Calculate the actual duration
+        let duration_ms = if let Some(start) = *self.current_tool_start.lock().unwrap() {
+            start.elapsed().as_millis()
+        } else {
+            0
+        };
+
+        // Get the tool name and caption
+        if let Some(tool_name) = self.current_tool_name.lock().unwrap().as_ref() {
+            let content = self.current_tool_output.lock().unwrap().join("\n");
+            let caption = self.current_tool_caption.lock().unwrap().clone();
+            let caption = if caption.is_empty() {
+                "Completed".to_string()
+            } else {
+                caption
+            };
+
+            // Update the tool detail panel with the complete output without adding a new header
+            // This keeps the original header in place to be updated by tool_complete
+            self.tui.update_tool_detail(tool_name, &content);
+
+            // Determine success based on whether there's an error in the output
+            // This is a simple heuristic - you might want to make this more sophisticated
+            let success = !content.contains("error")
+                && !content.contains("Error")
+                && !content.contains("ERROR");
+
+            // Send the completion status to update the header
+            self.tui
+                .tool_complete(tool_name, success, duration_ms, &caption);
+        }
+
+        // Clear the buffers
+        *self.current_tool_name.lock().unwrap() = None;
+        self.current_tool_output.lock().unwrap().clear();
+        *self.current_tool_start.lock().unwrap() = None;
+        *self.current_tool_caption.lock().unwrap() = String::new();
+    }
+
+    fn print_agent_prompt(&self) {
+        self.tui.output("\n💬 ");
+    }
+
+    fn print_agent_response(&self, content: &str) {
+        self.tui.output(content);
+    }
+
+    fn notify_sse_received(&self) {
+        // Notify the TUI that an SSE was received
+        self.tui.sse_received();
+    }
+
+    fn flush(&self) {
+        // No-op for TUI since it handles its own rendering
+    }
+}
--- a/crates/g3-computer-control/Cargo.toml
+++ b/crates/g3-computer-control/Cargo.toml
@@ -0,0 +1,46 @@
+[package]
+name = "g3-computer-control"
+version = "0.1.0"
+edition = "2021"
+
+[dependencies]
+# Workspace dependencies
+tokio = { workspace = true }
+anyhow = { workspace = true }
+thiserror = { workspace = true }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+uuid = { workspace = true }
+
+shellexpand = "3.1"
+# Async trait support
+async-trait = "0.1"
+
+# WebDriver support
+fantoccini = "0.21"
+
+# OCR dependencies
+tesseract = "0.14"
+
+# macOS dependencies
+[target.'cfg(target_os = "macos")'.dependencies]
+core-graphics = "0.23"
+core-foundation = "0.9"
+cocoa = "0.25"
+objc = "0.2"
+image = "0.24"
+
+# Linux dependencies
+[target.'cfg(target_os = "linux")'.dependencies]
+x11 = { version = "2.21", features = ["xlib", "xtest"] }
+image = "0.24"
+
+# Windows dependencies
+[target.'cfg(target_os = "windows")'.dependencies]
+windows = { version = "0.52", features = [
+    "Win32_Foundation",
+    "Win32_UI_WindowsAndMessaging",
+    "Win32_UI_Input_KeyboardAndMouse",
+    "Win32_Graphics_Gdi",
+] }
--- a/crates/g3-computer-control/examples/debug_screenshot.rs
+++ b/crates/g3-computer-control/examples/debug_screenshot.rs
@@ -0,0 +1,46 @@
+use core_graphics::display::CGDisplay;
+
+fn main() {
+    let display = CGDisplay::main();
+    let image = display.image().expect("Failed to capture screen");
+    
+    println!("CGImage properties:");
+    println!("  Width: {}", image.width());
+    println!("  Height: {}", image.height());
+    println!("  Bits per component: {}", image.bits_per_component());
+    println!("  Bits per pixel: {}", image.bits_per_pixel());
+    println!("  Bytes per row: {}", image.bytes_per_row());
+    
+    let data = image.data();
+    let expected_size = image.width() * image.height() * 4;
+    println!("  Data length: {}", data.len());
+    println!("  Expected (w*h*4): {}", expected_size);
+    
+    // Check if there's padding in rows
+    let bytes_per_row = image.bytes_per_row();
+    let width = image.width();
+    let expected_bytes_per_row = width * 4;
+    println!("\nRow alignment:");
+    println!("  Actual bytes per row: {}", bytes_per_row);
+    println!("  Expected (width * 4): {}", expected_bytes_per_row);
+    println!("  Padding per row: {}", bytes_per_row - expected_bytes_per_row);
+    
+    // Sample some pixels from different locations
+    println!("\nFirst 3 pixels (raw bytes):");
+    for i in 0..3 {
+        let offset = i * 4;
+        println!("  Pixel {}: [{:3}, {:3}, {:3}, {:3}]", 
+                 i, data[offset], data[offset+1], data[offset+2], data[offset+3]);
+    }
+    
+    // Check a pixel from the middle
+    let mid_row = image.height() / 2;
+    let mid_col = image.width() / 2;
+    let mid_offset = (mid_row * bytes_per_row + mid_col * 4) as usize;
+    println!("\nMiddle pixel (row {}, col {}):", mid_row, mid_col);
+    println!("  Offset: {}", mid_offset);
+    if mid_offset + 3 < data.len() as usize {
+        println!("  Bytes: [{:3}, {:3}, {:3}, {:3}]", 
+                 data[mid_offset], data[mid_offset+1], data[mid_offset+2], data[mid_offset+3]);
+    }
+}
--- a/crates/g3-computer-control/examples/list_windows.rs
+++ b/crates/g3-computer-control/examples/list_windows.rs
@@ -0,0 +1,56 @@
+use core_graphics::window::{kCGWindowListOptionOnScreenOnly, kCGNullWindowID, CGWindowListCopyWindowInfo};
+use core_foundation::dictionary::CFDictionary;
+use core_foundation::string::CFString;
+use core_foundation::base::TCFType;
+
+fn main() {
+    println!("Listing all on-screen windows...");
+    println!("{:<10} {:<25} {}", "Window ID", "Owner", "Title");
+    println!("{}", "-".repeat(80));
+    
+    unsafe {
+        let window_list = CGWindowListCopyWindowInfo(
+            kCGWindowListOptionOnScreenOnly,
+            kCGNullWindowID
+        );
+        
+        let count = core_foundation::array::CFArray::<CFDictionary>::wrap_under_create_rule(window_list).len();
+        let array = core_foundation::array::CFArray::<CFDictionary>::wrap_under_create_rule(window_list);
+        
+        for i in 0..count {
+            let dict = array.get(i).unwrap();
+            
+            // Get window ID
+            let window_id_key = CFString::from_static_string("kCGWindowNumber");
+            let window_id: i64 = if let Some(value) = dict.find(window_id_key.as_concrete_TypeRef()) {
+                let num: core_foundation::number::CFNumber = TCFType::wrap_under_get_rule(*value as *const _);
+                num.to_i64().unwrap_or(0)
+            } else {
+                0
+            };
+            
+            // Get owner name
+            let owner_key = CFString::from_static_string("kCGWindowOwnerName");
+            let owner: String = if let Some(value) = dict.find(owner_key.as_concrete_TypeRef()) {
+                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
+                s.to_string()
+            } else {
+                "Unknown".to_string()
+            };
+            
+            // Get window name/title
+            let name_key = CFString::from_static_string("kCGWindowName");
+            let title: String = if let Some(value) = dict.find(name_key.as_concrete_TypeRef()) {
+                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
+                s.to_string()
+            } else {
+                "".to_string()
+            };
+            
+            // Filter for iTerm or show all
+            if owner.contains("iTerm") || owner.contains("Terminal") {
+                println!("{:<10} {:<25} {}", window_id, owner, title);
+            }
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/safari_demo.rs
+++ b/crates/g3-computer-control/examples/safari_demo.rs
@@ -0,0 +1,64 @@
+use g3_computer_control::SafariDriver;
+use g3_computer_control::webdriver::WebDriverController;
+use anyhow::Result;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    println!("Safari WebDriver Demo");
+    println!("=====================\n");
+    
+    println!("Make sure to:");
+    println!("1. Enable 'Allow Remote Automation' in Safari's Develop menu");
+    println!("2. Run: /usr/bin/safaridriver --enable");
+    println!("3. Start safaridriver in another terminal: safaridriver --port 4444\n");
+    
+    println!("Connecting to SafariDriver...");
+    let mut driver = SafariDriver::new().await?;
+    println!("✅ Connected!\n");
+    
+    // Navigate to a website
+    println!("Navigating to example.com...");
+    driver.navigate("https://example.com").await?;
+    println!("✅ Navigated\n");
+    
+    // Get page title
+    let title = driver.title().await?;
+    println!("Page title: {}\n", title);
+    
+    // Get current URL
+    let url = driver.current_url().await?;
+    println!("Current URL: {}\n", url);
+    
+    // Find an element
+    println!("Finding h1 element...");
+    let mut h1 = driver.find_element("h1").await?;
+    let h1_text = h1.text().await?;
+    println!("H1 text: {}\n", h1_text);
+    
+    // Find all paragraphs
+    println!("Finding all paragraphs...");
+    let paragraphs = driver.find_elements("p").await?;
+    println!("Found {} paragraphs\n", paragraphs.len());
+    
+    // Get page source
+    println!("Getting page source...");
+    let source = driver.page_source().await?;
+    println!("Page source length: {} bytes\n", source.len());
+    
+    // Execute JavaScript
+    println!("Executing JavaScript...");
+    let result = driver.execute_script("return document.title", vec![]).await?;
+    println!("JS result: {:?}\n", result);
+    
+    // Take a screenshot
+    println!("Taking screenshot...");
+    driver.screenshot("/tmp/safari_demo.png").await?;
+    println!("✅ Screenshot saved to /tmp/safari_demo.png\n");
+    
+    // Close the browser
+    println!("Closing browser...");
+    driver.quit().await?;
+    println!("✅ Done!");
+    
+    Ok(())
+}
--- a/crates/g3-computer-control/examples/test_permission_prompt.rs
+++ b/crates/g3-computer-control/examples/test_permission_prompt.rs
@@ -0,0 +1,21 @@
+use g3_computer_control::{create_controller, ComputerController};
+
+#[tokio::main]
+async fn main() {
+    println!("Testing screenshot with permission prompt...");
+    
+    let controller = create_controller().expect("Failed to create controller");
+    
+    match controller.take_screenshot("/tmp/test_with_prompt.png", None, None).await {
+        Ok(_) => {
+            println!("\n✅ Screenshot saved to /tmp/test_with_prompt.png");
+            println!("Opening screenshot...");
+            let _ = std::process::Command::new("open")
+                .arg("/tmp/test_with_prompt.png")
+                .spawn();
+        }
+        Err(e) => {
+            println!("❌ Screenshot failed: {}", e);
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/test_screencapture_direct.rs
+++ b/crates/g3-computer-control/examples/test_screencapture_direct.rs
@@ -0,0 +1,39 @@
+use std::process::Command;
+
+fn main() {
+    let path = "/tmp/rust_screencapture_test.png";
+    
+    println!("Testing screencapture command from Rust...");
+    
+    let mut cmd = Command::new("screencapture");
+    cmd.arg("-x"); // No sound
+    cmd.arg(path);
+    
+    println!("Command: {:?}", cmd);
+    
+    match cmd.output() {
+        Ok(output) => {
+            println!("Exit status: {}", output.status);
+            println!("Stdout: {}", String::from_utf8_lossy(&output.stdout));
+            println!("Stderr: {}", String::from_utf8_lossy(&output.stderr));
+            
+            if output.status.success() {
+                println!("\n✅ Screenshot saved to: {}", path);
+                
+                // Check file exists and size
+                if let Ok(metadata) = std::fs::metadata(path) {
+                    println!("File size: {} bytes ({:.1} MB)", metadata.len(), metadata.len() as f64 / 1_000_000.0);
+                }
+                
+                // Open it
+                let _ = Command::new("open").arg(path).spawn();
+                println!("\nOpened screenshot - please verify it looks correct!");
+            } else {
+                println!("\n❌ Screenshot failed!");
+            }
+        }
+        Err(e) => {
+            println!("❌ Failed to execute screencapture: {}", e);
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/test_screenshot_fix.rs
+++ b/crates/g3-computer-control/examples/test_screenshot_fix.rs
@@ -0,0 +1,69 @@
+use core_graphics::display::CGDisplay;
+use image::{ImageBuffer, RgbaImage};
+use std::path::Path;
+
+fn main() {
+    let display = CGDisplay::main();
+    let image = display.image().expect("Failed to capture screen");
+    
+    let width = image.width() as u32;
+    let height = image.height() as u32;
+    let bytes_per_row = image.bytes_per_row() as usize;
+    let data = image.data();
+    
+    println!("Testing screenshot fix...");
+    println!("Image: {}x{}, bytes_per_row: {}", width, height, bytes_per_row);
+    println!("Expected bytes per row: {}", width * 4);
+    println!("Padding per row: {} bytes", bytes_per_row - (width as usize * 4));
+    
+    // OLD METHOD (broken) - treating data as continuous
+    println!("\n=== OLD METHOD (BROKEN) ===");
+    let mut old_rgba = Vec::with_capacity(data.len() as usize);
+    for chunk in data.chunks_exact(4) {
+        old_rgba.push(chunk[2]); // R
+        old_rgba.push(chunk[1]); // G
+        old_rgba.push(chunk[0]); // B
+        old_rgba.push(chunk[3]); // A
+    }
+    println!("Converted {} pixels", old_rgba.len() / 4);
+    println!("Expected {} pixels", width * height);
+    
+    // NEW METHOD (fixed) - handling row padding
+    println!("\n=== NEW METHOD (FIXED) ===");
+    let mut new_rgba = Vec::with_capacity((width * height * 4) as usize);
+    for row in 0..height as usize {
+        let row_start = row * bytes_per_row;
+        let row_end = row_start + (width as usize * 4);
+        
+        for chunk in data[row_start..row_end].chunks_exact(4) {
+            new_rgba.push(chunk[2]); // R
+            new_rgba.push(chunk[1]); // G
+            new_rgba.push(chunk[0]); // B
+            new_rgba.push(chunk[3]); // A
+        }
+    }
+    println!("Converted {} pixels", new_rgba.len() / 4);
+    println!("Expected {} pixels", width * height);
+    
+    // Save a small crop from both methods
+    let crop_size = 200;
+    
+    // Old method crop
+    let old_crop: Vec<u8> = old_rgba.iter().take((crop_size * crop_size * 4) as usize).copied().collect();
+    if let Some(old_img) = ImageBuffer::from_raw(crop_size, crop_size, old_crop) {
+        let old_img: RgbaImage = old_img;
+        old_img.save("/tmp/screenshot_old_method.png").unwrap();
+        println!("\nSaved OLD method crop to: /tmp/screenshot_old_method.png");
+    }
+    
+    // New method crop
+    let new_crop: Vec<u8> = new_rgba.iter().take((crop_size * crop_size * 4) as usize).copied().collect();
+    if let Some(new_img) = ImageBuffer::from_raw(crop_size, crop_size, new_crop) {
+        let new_img: RgbaImage = new_img;
+        new_img.save("/tmp/screenshot_new_method.png").unwrap();
+        println!("Saved NEW method crop to: /tmp/screenshot_new_method.png");
+    }
+    
+    println!("\nOpen both images to compare:");
+    println!("  open /tmp/screenshot_old_method.png /tmp/screenshot_new_method.png");
+}
--- a/crates/g3-computer-control/examples/test_window_capture.rs
+++ b/crates/g3-computer-control/examples/test_window_capture.rs
@@ -0,0 +1,45 @@
+use g3_computer_control::create_controller;
+
+#[tokio::main]
+async fn main() {
+    println!("Testing window-specific screenshot capture...");
+    
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Test 1: Capture iTerm2 window
+    println!("\n1. Capturing iTerm2 window...");
+    match controller.take_screenshot("/tmp/iterm_window.png", None, Some("iTerm2")).await {
+        Ok(_) => {
+            println!("   ✅ iTerm2 window captured to /tmp/iterm_window.png");
+            let _ = std::process::Command::new("open").arg("/tmp/iterm_window.png").spawn();
+        }
+        Err(e) => println!("   ❌ Failed: {}", e),
+    }
+    
+    // Wait a moment for the image to open
+    tokio::time::sleep(tokio::time::Duration::from_secs(2)).await;
+    
+    // Test 2: Full screen capture for comparison
+    println!("\n2. Capturing full screen for comparison...");
+    match controller.take_screenshot("/tmp/fullscreen.png", None, None).await {
+        Ok(_) => {
+            println!("   ✅ Full screen captured to /tmp/fullscreen.png");
+            let _ = std::process::Command::new("open").arg("/tmp/fullscreen.png").spawn();
+        }
+        Err(e) => println!("   ❌ Failed: {}", e),
+    }
+    
+    println!("\n=== Comparison ===");
+    println!("iTerm window:  /tmp/iterm_window.png (should show ONLY iTerm window)");
+    println!("Full screen:   /tmp/fullscreen.png (should show entire desktop)");
+    
+    // Show file sizes
+    if let Ok(meta1) = std::fs::metadata("/tmp/iterm_window.png") {
+        if let Ok(meta2) = std::fs::metadata("/tmp/fullscreen.png") {
+            println!("\nFile sizes:");
+            println!("  iTerm window: {:.1} MB", meta1.len() as f64 / 1_000_000.0);
+            println!("  Full screen:  {:.1} MB", meta2.len() as f64 / 1_000_000.0);
+            println!("\nWindow capture should be smaller than full screen.");
+        }
+    }
+}
--- a/crates/g3-computer-control/src/lib.rs
+++ b/crates/g3-computer-control/src/lib.rs
@@ -0,0 +1,35 @@
+pub mod types;
+pub mod platform;
+pub mod webdriver;
+
+// Re-export webdriver types for convenience
+pub use webdriver::{WebDriverController, WebElement, safari::SafariDriver};
+
+use anyhow::Result;
+use async_trait::async_trait;
+use types::*;
+
+#[async_trait]
+pub trait ComputerController: Send + Sync {
+    // Screen capture
+    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()>;
+    
+    // OCR operations
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<String>;
+    async fn extract_text_from_image(&self, path: &str) -> Result<String>;
+}
+
+// Platform-specific constructor
+pub fn create_controller() -> Result<Box<dyn ComputerController>> {
+    #[cfg(target_os = "macos")]
+    return Ok(Box::new(platform::macos::MacOSController::new()?));
+    
+    #[cfg(target_os = "linux")]
+    return Ok(Box::new(platform::linux::LinuxController::new()?));
+    
+    #[cfg(target_os = "windows")]
+    return Ok(Box::new(platform::windows::WindowsController::new()?));
+    
+    #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
+    anyhow::bail!("Unsupported platform")
+}
--- a/crates/g3-computer-control/src/platform/linux.rs
+++ b/crates/g3-computer-control/src/platform/linux.rs
@@ -0,0 +1,161 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use tesseract::Tesseract;
+use uuid::Uuid;
+
+pub struct LinuxController {
+    // Placeholder for X11 connection or other state
+}
+
+impl LinuxController {
+    pub fn new() -> Result<Self> {
+        // Initialize X11 connection
+        tracing::warn!("Linux computer control not fully implemented");
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for LinuxController {
+    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn double_click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn type_text(&self, _text: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn press_key(&self, _key: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  \
+                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
+                RHEL/CentOS:   sudo yum install tesseract\n  \
+                Arch Linux:    sudo pacman -S tesseract\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
+                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
+                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  \
+                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
+                RHEL/CentOS:   sudo yum install tesseract\n  \
+                Arch Linux:    sudo pacman -S tesseract\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = format!("/tmp/g3_ocr_search_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
+                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
+                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
--- a/crates/g3-computer-control/src/platform/macos.rs
+++ b/crates/g3-computer-control/src/platform/macos.rs
@@ -0,0 +1,125 @@
+use crate::{ComputerController, types::Rect};
+use anyhow::Result;
+use async_trait::async_trait;
+use std::path::Path;
+use tesseract::Tesseract;
+
+pub struct MacOSController {
+    // Empty struct for now
+}
+
+impl MacOSController {
+    pub fn new() -> Result<Self> {
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for MacOSController {
+    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
+        // Determine the temporary directory for screenshots
+        let temp_dir = std::env::var("TMPDIR")
+            .or_else(|_| std::env::var("HOME").map(|h| format!("{}/tmp", h)))
+            .unwrap_or_else(|_| "/tmp".to_string());
+        
+        // Ensure temp directory exists
+        std::fs::create_dir_all(&temp_dir)?;
+        
+        // If path is relative or doesn't specify a directory, use temp_dir
+        let final_path = if path.starts_with('/') {
+            path.to_string()
+        } else {
+            format!("{}/{}", temp_dir.trim_end_matches('/'), path)
+        };
+        
+        let path_obj = Path::new(&final_path);
+        if let Some(parent) = path_obj.parent() {
+            std::fs::create_dir_all(parent)?;
+        }
+        
+        let mut cmd = std::process::Command::new("screencapture");
+        
+        // Add flags
+        cmd.arg("-x"); // No sound
+        
+        if let Some(region) = region {
+            // Capture specific region: -R x,y,width,height
+            cmd.arg("-R");
+            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
+        }
+        
+        if let Some(app_name) = window_id {
+            // Capture specific window by app name
+            // Use AppleScript to get window ID
+            let script = format!(r#"tell application "{}" to id of window 1"#, app_name);
+            let output = std::process::Command::new("osascript")
+                .arg("-e")
+                .arg(&script)
+                .output()?;
+            
+            if output.status.success() {
+                let window_id_str = String::from_utf8_lossy(&output.stdout).trim().to_string();
+                cmd.arg(format!("-l{}", window_id_str));
+            }
+        }
+        
+        cmd.arg(&final_path);
+        
+        let screenshot_result = cmd.output()?;
+        
+        if !screenshot_result.status.success() {
+            let stderr = String::from_utf8_lossy(&screenshot_result.stderr);
+            return Err(anyhow::anyhow!("screencapture failed: {}", stderr));
+        }
+        
+        Ok(())
+    }
+    
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<String> {
+        // Take screenshot of region first
+        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, Some(region), None).await?;
+        
+        // Extract text from the screenshot
+        let result = self.extract_text_from_image(&temp_path).await?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        Ok(result)
+    }
+    
+    async fn extract_text_from_image(&self, path: &str) -> Result<String> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let text = tess.set_image(path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        Ok(text)
+    }
+}
--- a/crates/g3-computer-control/src/platform/macos.rs.bak
+++ b/crates/g3-computer-control/src/platform/macos.rs.bak
@@ -0,0 +1,425 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use core_graphics::display::CGPoint;
+use core_graphics::event::{CGEvent, CGEventType, CGMouseButton, CGEventTapLocation};
+use core_graphics::event_source::{CGEventSource, CGEventSourceStateID};
+use std::path::Path;
+use tesseract::Tesseract;
+
+// MacOSController doesn't store CGEventSource to avoid Send/Sync issues
+// We create it fresh for each operation
+pub struct MacOSController {
+    // Empty struct - event source created per operation
+}
+
+impl MacOSController {
+    pub fn new() -> Result<Self> {
+        // Test that we can create an event source
+        let _event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+            .map_err(|_| anyhow::anyhow!("Failed to create event source. Make sure Accessibility permissions are granted."))?;
+        Ok(Self {})
+    }
+    
+    fn key_to_keycode(&self, key: &str) -> Result<u16> {
+        // Map key names to macOS keycodes
+        let keycode = match key.to_lowercase().as_str() {
+            "return" | "enter" => 36,
+            "tab" => 48,
+            "space" => 49,
+            "delete" | "backspace" => 51,
+            "escape" | "esc" => 53,
+            "command" | "cmd" => 55,
+            "shift" => 56,
+            "capslock" => 57,
+            "option" | "alt" => 58,
+            "control" | "ctrl" => 59,
+            "left" => 123,
+            "right" => 124,
+            "down" => 125,
+            "up" => 126,
+            _ => anyhow::bail!("Unknown key: {}", key),
+        };
+        Ok(keycode)
+    }
+}
+
+#[async_trait]
+impl ComputerController for MacOSController {
+    async fn move_mouse(&self, x: i32, y: i32) -> Result<()> {
+        let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+            .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+        let point = CGPoint::new(x as f64, y as f64);
+        let event = CGEvent::new_mouse_event(
+            event_source,
+            CGEventType::MouseMoved,
+            point,
+            CGMouseButton::Left,
+        ).map_err(|_| anyhow::anyhow!("Failed to create mouse move event"))?;
+        
+        event.post(CGEventTapLocation::HID);
+        Ok(())
+    }
+    
+    async fn click(&self, button: MouseButton) -> Result<()> {
+        let (cg_button, down_type, up_type) = match button {
+            MouseButton::Left => (CGMouseButton::Left, CGEventType::LeftMouseDown, CGEventType::LeftMouseUp),
+            MouseButton::Right => (CGMouseButton::Right, CGEventType::RightMouseDown, CGEventType::RightMouseUp),
+            MouseButton::Middle => (CGMouseButton::Center, CGEventType::OtherMouseDown, CGEventType::OtherMouseUp),
+        };
+        
+        let point = {
+            // Get current mouse position
+            let temp_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            let event = CGEvent::new(temp_source)
+                .map_err(|_| anyhow::anyhow!("Failed to get mouse position"))?;
+            let p = event.location();
+            p
+        };
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Mouse down
+            let down_event = CGEvent::new_mouse_event(
+                event_source,
+                down_type,
+                point,
+                cg_button,
+            ).map_err(|_| anyhow::anyhow!("Failed to create mouse down event"))?;
+            down_event.post(CGEventTapLocation::HID);
+        } // event_source and down_event dropped here
+        
+        // Small delay
+        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            let up_event = CGEvent::new_mouse_event(
+                event_source,
+                up_type,
+                point,
+                cg_button,
+            ).map_err(|_| anyhow::anyhow!("Failed to create mouse up event"))?;
+            up_event.post(CGEventTapLocation::HID);
+        } // event_source and up_event dropped here
+        
+        Ok(())
+    }
+    
+    async fn double_click(&self, button: MouseButton) -> Result<()> {
+        self.click(button).await?;
+        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
+        self.click(button).await?;
+        Ok(())
+    }
+    
+    async fn type_text(&self, text: &str) -> Result<()> {
+        for ch in text.chars() {
+            {
+                let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                    .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+                
+                // Create keyboard event for character
+                let event = CGEvent::new_keyboard_event(
+                    event_source,
+                    0, // keycode (0 for unicode)
+                    true,
+                ).map_err(|_| anyhow::anyhow!("Failed to create keyboard event"))?;
+                
+                // Set unicode string
+                let mut utf16_buf = [0u16; 2];
+                let utf16_slice = ch.encode_utf16(&mut utf16_buf);
+                let utf16_chars: Vec<u16> = utf16_slice.iter().copied().collect();
+                
+                event.set_string_from_utf16_unchecked(utf16_chars.as_slice());
+                event.post(CGEventTapLocation::HID);
+            } // event_source and event dropped here
+            
+            tokio::time::sleep(tokio::time::Duration::from_millis(10)).await;
+        }
+        Ok(())
+    }
+    
+    async fn press_key(&self, key: &str) -> Result<()> {
+        let keycode = self.key_to_keycode(key)?;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Key down
+            let down_event = CGEvent::new_keyboard_event(
+                event_source,
+                keycode,
+                true,
+            ).map_err(|_| anyhow::anyhow!("Failed to create key down event"))?;
+            down_event.post(CGEventTapLocation::HID);
+        } // event_source and down_event dropped here
+        
+        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Key up
+            let up_event = CGEvent::new_keyboard_event(
+                event_source,
+                keycode,
+                false,
+            ).map_err(|_| anyhow::anyhow!("Failed to create key up event"))?;
+            up_event.post(CGEventTapLocation::HID);
+        } // event_source and up_event dropped here
+        
+        Ok(())
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        // Note: Full implementation would use CGWindowListCopyWindowInfo
+        // For now, return empty list as this requires more complex FFI
+        tracing::warn!("list_windows not fully implemented on macOS");
+        Ok(vec![])
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        // Note: Full implementation would use NSWorkspace to activate application
+        tracing::warn!("focus_window not fully implemented on macOS");
+        Ok(())
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_window_bounds not fully implemented on macOS");
+        Ok(Rect { x: 0, y: 0, width: 800, height: 600 })
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        // Note: Full implementation would use macOS Accessibility API
+        tracing::warn!("find_element not fully implemented on macOS");
+        Ok(None)
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_element_text not fully implemented on macOS");
+        Ok(String::new())
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_element_bounds not fully implemented on macOS");
+        Ok(Rect { x: 0, y: 0, width: 100, height: 30 })
+    }
+    
+    async fn take_screenshot(&self, path: &str, _region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
+        // Use native macOS screencapture command which handles all the format complexities
+        
+        // Check if we have Screen Recording permission by attempting a test capture
+        // If we only get wallpaper/menubar but no windows, we need permission
+        let needs_permission_check = std::env::var("G3_SKIP_PERMISSION_CHECK").is_err();
+        
+        if needs_permission_check {
+            // Try to open Screen Recording settings if this is the first screenshot
+            static PERMISSION_PROMPTED: std::sync::atomic::AtomicBool = std::sync::atomic::AtomicBool::new(false);
+            
+            if !PERMISSION_PROMPTED.swap(true, std::sync::atomic::Ordering::Relaxed) {
+                tracing::warn!("\n=== Screen Recording Permission Required ===\n\
+                    macOS requires explicit permission to capture window content.\n\
+                    If screenshots only show wallpaper/menubar (no windows):\n\n\
+                    1. Open System Settings > Privacy & Security > Screen Recording\n\
+                    2. Enable permission for your terminal (iTerm/Terminal) or g3\n\
+                    3. Restart your terminal if needed\n\n\
+                    Opening Screen Recording settings now...\n");
+                
+                // Try to open the settings (non-blocking)
+                let _ = std::process::Command::new("open")
+                    .arg("x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture")
+                    .spawn();
+            }
+        }
+        
+        let path_obj = Path::new(path);
+        if let Some(parent) = path_obj.parent() {
+            std::fs::create_dir_all(parent)?;
+        }
+        
+        let mut cmd = std::process::Command::new("screencapture");
+        
+        // Add flags
+        cmd.arg("-x"); // No sound
+        
+        if let Some(window_id) = window_id {
+            // Capture specific window by getting its bounds and using region capture
+            // window_id format: "AppName" or "AppName:WindowTitle"
+            let app_name = window_id.split(':').next().unwrap_or(window_id);
+            
+            // Use AppleScript to get window bounds
+            let script = format!(
+                r#"tell application "{}"
+                    tell current window
+                        get bounds
+                    end tell
+                end tell"#,
+                app_name
+            );
+            
+            let output = std::process::Command::new("osascript")
+                .arg("-e")
+                .arg(&script)
+                .output()
+                .map_err(|e| anyhow::anyhow!("Failed to get window bounds: {}", e))?;
+            
+            if output.status.success() {
+                let bounds_str = String::from_utf8_lossy(&output.stdout);
+                let bounds: Vec<i32> = bounds_str
+                    .trim()
+                    .split(',')
+                    .filter_map(|s| s.trim().parse().ok())
+                    .collect();
+                
+                if bounds.len() == 4 {
+                    let (left, top, right, bottom) = (bounds[0], bounds[1], bounds[2], bounds[3]);
+                    let width = right - left;
+                    let height = bottom - top;
+                    
+                    cmd.arg("-R");
+                    cmd.arg(format!("{},{},{},{}", left, top, width, height));
+                    
+                    tracing::debug!("Capturing window '{}' at region: {},{} {}x{}", app_name, left, top, width, height);
+                } else {
+                    tracing::warn!("Failed to parse window bounds, capturing full screen");
+                }
+            } else {
+                tracing::warn!("Failed to get window bounds for '{}', capturing full screen", app_name);
+            }
+        } else if let Some(region) = _region {
+            // Capture specific region: -R x,y,width,height
+            cmd.arg("-R");
+            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
+        }
+        
+        cmd.arg(path);
+        
+        let output = cmd.output()
+            .map_err(|e| anyhow::anyhow!("Failed to execute screencapture: {}", e))?;
+        
+        if !output.status.success() {
+            let stderr = String::from_utf8_lossy(&output.stderr);
+            anyhow::bail!("screencapture failed: {}", stderr);
+        }
+        
+        tracing::debug!("Screenshot saved using screencapture: {}", path);
+        
+        Ok(())
+    }
+    
+    }
+    
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<OCRResult> {
+        // Take screenshot of region first
+        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, Some(region), None).await?;
+        
+        // Extract text from the screenshot
+        let result = self.extract_text_from_image(&temp_path).await?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        Ok(result)
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = format!("/tmp/g3_ocr_search_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
--- a/crates/g3-computer-control/src/platform/mod.rs
+++ b/crates/g3-computer-control/src/platform/mod.rs
@@ -0,0 +1,8 @@
+#[cfg(target_os = "macos")]
+pub mod macos;
+
+#[cfg(target_os = "linux")]
+pub mod linux;
+
+#[cfg(target_os = "windows")]
+pub mod windows;
--- a/crates/g3-computer-control/src/platform/windows.rs
+++ b/crates/g3-computer-control/src/platform/windows.rs
@@ -0,0 +1,162 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use tesseract::Tesseract;
+use uuid::Uuid;
+
+pub struct WindowsController {
+    // Placeholder for Windows-specific state
+}
+
+impl WindowsController {
+    pub fn new() -> Result<Self> {
+        tracing::warn!("Windows computer control not fully implemented");
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for WindowsController {
+    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn double_click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn type_text(&self, _text: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn press_key(&self, _key: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("where")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract on Windows:\n  \
+                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                2. Run the installer and follow the instructions\n  \
+                3. Add tesseract to your PATH environment variable\n  \
+                4. Restart your terminal/command prompt\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                    2. Make sure to select 'Additional language data' during installation\n  \
+                    3. Ensure tesseract is in your PATH", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("where")
+            .arg("tesseract")
+            .output();
+        
+        if tesseract_check.is_err() || !tesseract_check.as_ref().unwrap().status.success() {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract on Windows:\n  \
+                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                2. Run the installer and follow the instructions\n  \
+                3. Add tesseract to your PATH environment variable\n  \
+                4. Restart your terminal/command prompt\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = format!("C:\\\\Temp\\\\g3_ocr_search_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                    2. Make sure to select 'Additional language data' during installation\n  \
+                    3. Ensure tesseract is in your PATH", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
--- a/crates/g3-computer-control/src/types.rs
+++ b/crates/g3-computer-control/src/types.rs
@@ -0,0 +1,9 @@
+use serde::{Deserialize, Serialize};
+
+#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
+pub struct Rect {
+    pub x: i32,
+    pub y: i32,
+    pub width: i32,
+    pub height: i32,
+}
--- a/crates/g3-computer-control/src/webdriver/mod.rs
+++ b/crates/g3-computer-control/src/webdriver/mod.rs
@@ -0,0 +1,111 @@
+pub mod safari;
+
+use anyhow::Result;
+use async_trait::async_trait;
+use serde_json::Value;
+
+/// WebDriver controller for browser automation
+#[async_trait]
+pub trait WebDriverController: Send + Sync {
+    /// Navigate to a URL
+    async fn navigate(&mut self, url: &str) -> Result<()>;
+    
+    /// Get the current URL
+    async fn current_url(&self) -> Result<String>;
+    
+    /// Get the page title
+    async fn title(&self) -> Result<String>;
+    
+    /// Find an element by CSS selector
+    async fn find_element(&mut self, selector: &str) -> Result<WebElement>;
+    
+    /// Find multiple elements by CSS selector
+    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>>;
+    
+    /// Execute JavaScript in the browser
+    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value>;
+    
+    /// Get the page source (HTML)
+    async fn page_source(&self) -> Result<String>;
+    
+    /// Take a screenshot and save to path
+    async fn screenshot(&mut self, path: &str) -> Result<()>;
+    
+    /// Close the current window/tab
+    async fn close(&mut self) -> Result<()>;
+    
+    /// Quit the browser session
+    async fn quit(self) -> Result<()>;
+}
+
+/// Represents a web element in the DOM
+pub struct WebElement {
+    pub(crate) inner: fantoccini::elements::Element,
+}
+
+impl WebElement {
+    /// Click the element
+    pub async fn click(&mut self) -> Result<()> {
+        self.inner.click().await?;
+        Ok(())
+    }
+    
+    /// Send keys/text to the element
+    pub async fn send_keys(&mut self, text: &str) -> Result<()> {
+        self.inner.send_keys(text).await?;
+        Ok(())
+    }
+    
+    /// Clear the element's content (for input fields)
+    pub async fn clear(&mut self) -> Result<()> {
+        self.inner.clear().await?;
+        Ok(())
+    }
+    
+    /// Get the element's text content
+    pub async fn text(&self) -> Result<String> {
+        Ok(self.inner.text().await?)
+    }
+    
+    /// Get an attribute value
+    pub async fn attr(&self, name: &str) -> Result<Option<String>> {
+        Ok(self.inner.attr(name).await?)
+    }
+    
+    /// Get a property value
+    pub async fn prop(&self, name: &str) -> Result<Option<String>> {
+        Ok(self.inner.prop(name).await?)
+    }
+    
+    /// Get the element's HTML
+    pub async fn html(&self, inner: bool) -> Result<String> {
+        Ok(self.inner.html(inner).await?)
+    }
+    
+    /// Check if element is displayed
+    pub async fn is_displayed(&self) -> Result<bool> {
+        Ok(self.inner.is_displayed().await?)
+    }
+    
+    /// Check if element is enabled
+    pub async fn is_enabled(&self) -> Result<bool> {
+        Ok(self.inner.is_enabled().await?)
+    }
+    
+    /// Check if element is selected (for checkboxes/radio buttons)
+    pub async fn is_selected(&self) -> Result<bool> {
+        Ok(self.inner.is_selected().await?)
+    }
+    
+    /// Find a child element by CSS selector
+    pub async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
+        let elem = self.inner.find(fantoccini::Locator::Css(selector)).await?;
+        Ok(WebElement { inner: elem })
+    }
+    
+    /// Find multiple child elements by CSS selector
+    pub async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
+        let elems = self.inner.find_all(fantoccini::Locator::Css(selector)).await?;
+        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
+    }
+}
--- a/crates/g3-computer-control/src/webdriver/safari.rs
+++ b/crates/g3-computer-control/src/webdriver/safari.rs
@@ -0,0 +1,212 @@
+use super::{WebDriverController, WebElement};
+use anyhow::{Context, Result};
+use async_trait::async_trait;
+use fantoccini::{Client, ClientBuilder};
+use serde_json::Value;
+use std::time::Duration;
+
+/// SafariDriver WebDriver controller
+pub struct SafariDriver {
+    client: Client,
+}
+
+impl SafariDriver {
+    /// Create a new SafariDriver instance
+    /// 
+    /// This will connect to SafariDriver running on the default port (4444).
+    /// Make sure to enable "Allow Remote Automation" in Safari's Develop menu first.
+    /// 
+    /// You can start SafariDriver manually with:
+    /// ```bash
+    /// /usr/bin/safaridriver --enable
+    /// ```
+    pub async fn new() -> Result<Self> {
+        Self::with_port(4444).await
+    }
+    
+    /// Create a new SafariDriver instance with a custom port
+    pub async fn with_port(port: u16) -> Result<Self> {
+        let url = format!("http://localhost:{}", port);
+        
+        let mut caps = serde_json::Map::new();
+        caps.insert("browserName".to_string(), Value::String("safari".to_string()));
+        
+        let client = ClientBuilder::native()
+            .capabilities(caps)
+            .connect(&url)
+            .await
+            .context("Failed to connect to SafariDriver. Make sure SafariDriver is running and 'Allow Remote Automation' is enabled in Safari's Develop menu.")?;
+        
+        Ok(Self { client })
+    }
+    
+    /// Go back in browser history
+    pub async fn back(&mut self) -> Result<()> {
+        self.client.back().await?;
+        Ok(())
+    }
+    
+    /// Go forward in browser history
+    pub async fn forward(&mut self) -> Result<()> {
+        self.client.forward().await?;
+        Ok(())
+    }
+    
+    /// Refresh the current page
+    pub async fn refresh(&mut self) -> Result<()> {
+        self.client.refresh().await?;
+        Ok(())
+    }
+    
+    /// Get all window handles
+    pub async fn window_handles(&mut self) -> Result<Vec<String>> {
+        let handles = self.client.windows().await?;
+        Ok(handles.into_iter()
+            .map(|h| h.into())
+            .collect())
+    }
+    
+    /// Switch to a window by handle
+    pub async fn switch_to_window(&mut self, handle: &str) -> Result<()> {
+        let window_handle: fantoccini::wd::WindowHandle = handle.to_string().try_into()?;
+        self.client.switch_to_window(window_handle).await?;
+        Ok(())
+    }
+    
+    /// Get the current window handle
+    pub async fn current_window_handle(&mut self) -> Result<String> {
+        Ok(self.client.window().await?.into())
+    }
+    
+    /// Close the current window
+    pub async fn close_window(&mut self) -> Result<()> {
+        self.client.close_window().await?;
+        Ok(())
+    }
+    
+    /// Create a new window/tab
+    pub async fn new_window(&mut self, is_tab: bool) -> Result<String> {
+        let window_type = if is_tab { "tab" } else { "window" };
+        let response = self.client.new_window(window_type == "tab").await?;
+        Ok(response.handle.into())
+    }
+    
+    /// Get cookies
+    pub async fn get_cookies(&mut self) -> Result<Vec<fantoccini::cookies::Cookie<'static>>> {
+        Ok(self.client.get_all_cookies().await?)
+    }
+    
+    /// Add a cookie
+    pub async fn add_cookie(&mut self, cookie: fantoccini::cookies::Cookie<'static>) -> Result<()> {
+        self.client.add_cookie(cookie).await?;
+        Ok(())
+    }
+    
+    /// Delete all cookies
+    pub async fn delete_all_cookies(&mut self) -> Result<()> {
+        self.client.delete_all_cookies().await?;
+        Ok(())
+    }
+    
+    /// Wait for an element to appear (with timeout)
+    pub async fn wait_for_element(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
+        let start = std::time::Instant::now();
+        let poll_interval = Duration::from_millis(100);
+        
+        loop {
+            if let Ok(elem) = self.find_element(selector).await {
+                return Ok(elem);
+            }
+            
+            if start.elapsed() >= timeout {
+                anyhow::bail!("Timeout waiting for element: {}", selector);
+            }
+            
+            tokio::time::sleep(poll_interval).await;
+        }
+    }
+    
+    /// Wait for an element to be visible (with timeout)
+    pub async fn wait_for_visible(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
+        let start = std::time::Instant::now();
+        let poll_interval = Duration::from_millis(100);
+        
+        loop {
+            if let Ok(elem) = self.find_element(selector).await {
+                if elem.is_displayed().await.unwrap_or(false) {
+                    return Ok(elem);
+                }
+            }
+            
+            if start.elapsed() >= timeout {
+                anyhow::bail!("Timeout waiting for element to be visible: {}", selector);
+            }
+            
+            tokio::time::sleep(poll_interval).await;
+        }
+    }
+}
+
+#[async_trait]
+impl WebDriverController for SafariDriver {
+    async fn navigate(&mut self, url: &str) -> Result<()> {
+        self.client.goto(url).await?;
+        Ok(())
+    }
+    
+    async fn current_url(&self) -> Result<String> {
+        Ok(self.client.current_url().await?.to_string())
+    }
+    
+    async fn title(&self) -> Result<String> {
+        Ok(self.client.title().await?)
+    }
+    
+    async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
+        let elem = self.client.find(fantoccini::Locator::Css(selector)).await
+            .context(format!("Failed to find element with selector: {}", selector))?;
+        Ok(WebElement { inner: elem })
+    }
+    
+    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
+        let elems = self.client.find_all(fantoccini::Locator::Css(selector)).await?;
+        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
+    }
+    
+    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value> {
+        Ok(self.client.execute(script, args).await?)
+    }
+    
+    async fn page_source(&self) -> Result<String> {
+        Ok(self.client.source().await?)
+    }
+    
+    async fn screenshot(&mut self, path: &str) -> Result<()> {
+        let screenshot_data = self.client.screenshot().await?;
+        
+        // Expand tilde in path
+        let expanded_path = shellexpand::tilde(path);
+        let path_str = expanded_path.as_ref();
+        
+        // Create parent directories if needed
+        if let Some(parent) = std::path::Path::new(path_str).parent() {
+            std::fs::create_dir_all(parent)
+                .context("Failed to create parent directories for screenshot")?;
+        }
+        
+        std::fs::write(path_str, screenshot_data)
+            .context("Failed to write screenshot to file")?;
+        
+        Ok(())
+    }
+    
+    async fn close(&mut self) -> Result<()> {
+        self.client.close_window().await?;
+        Ok(())
+    }
+    
+    async fn quit(mut self) -> Result<()> {
+        self.client.close().await?;
+        Ok(())
+    }
+}
--- a/crates/g3-computer-control/tests/integration_test.rs
+++ b/crates/g3-computer-control/tests/integration_test.rs
@@ -0,0 +1,62 @@
+use g3_computer_control::*;
+
+#[tokio::test]
+async fn test_mouse_movement() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Move mouse to center of screen (assuming 1920x1080)
+    let result = controller.move_mouse(960, 540).await;
+    assert!(result.is_ok(), "Failed to move mouse: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_typing() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Type some text
+    let result = controller.type_text("Hello, World!").await;
+    assert!(result.is_ok(), "Failed to type text: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_screenshot() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Take screenshot
+    let path = "/tmp/test_screenshot.png";
+    let result = controller.take_screenshot(path, None, None).await;
+    assert!(result.is_ok(), "Failed to take screenshot: {:?}", result.err());
+    
+    // Verify file exists
+    assert!(std::path::Path::new(path).exists(), "Screenshot file was not created");
+    
+    // Clean up
+    let _ = std::fs::remove_file(path);
+}
+
+#[tokio::test]
+async fn test_click() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Click at a safe location
+    let result = controller.click(types::MouseButton::Left).await;
+    assert!(result.is_ok(), "Failed to click: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_double_click() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Double click
+    let result = controller.double_click(types::MouseButton::Left).await;
+    assert!(result.is_ok(), "Failed to double click: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_press_key() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Press escape key
+    let result = controller.press_key("escape").await;
+    assert!(result.is_ok(), "Failed to press key: {:?}", result.err());
+}
--- a/crates/g3-config/src/autonomous_config_tests.rs
+++ b/crates/g3-config/src/autonomous_config_tests.rs
@@ -0,0 +1,131 @@
+#[cfg(test)]
+mod autonomous_config_tests {
+    use crate::{Config, AnthropicConfig, DatabricksConfig};
+
+    #[test]
+    fn test_default_autonomous_config() {
+        let config = Config::default();
+        assert!(config.autonomous.coach_provider.is_none());
+        assert!(config.autonomous.coach_model.is_none());
+        assert!(config.autonomous.player_provider.is_none());
+        assert!(config.autonomous.player_model.is_none());
+    }
+
+    #[test]
+    fn test_for_coach_with_overrides() {
+        let mut config = Config::default();
+        
+        // Set up base config with anthropic
+        config.providers.anthropic = Some(AnthropicConfig {
+            api_key: "test-key".to_string(),
+            model: "claude-3-5-sonnet-20241022".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+        });
+        
+        // Set coach overrides
+        config.autonomous.coach_provider = Some("anthropic".to_string());
+        config.autonomous.coach_model = Some("claude-3-opus-20240229".to_string());
+        
+        let coach_config = config.for_coach().unwrap();
+        
+        // Verify coach uses overridden provider and model
+        assert_eq!(coach_config.providers.default_provider, "anthropic");
+        assert_eq!(
+            coach_config.providers.anthropic.as_ref().unwrap().model,
+            "claude-3-opus-20240229"
+        );
+    }
+
+    #[test]
+    fn test_for_player_with_overrides() {
+        let mut config = Config::default();
+        
+        // Set up base config with databricks
+        config.providers.databricks = Some(DatabricksConfig {
+            host: "https://test.databricks.com".to_string(),
+            token: Some("test-token".to_string()),
+            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+            use_oauth: Some(false),
+        });
+        
+        // Set player overrides
+        config.autonomous.player_provider = Some("databricks".to_string());
+        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());
+        
+        let player_config = config.for_player().unwrap();
+        
+        // Verify player uses overridden provider and model
+        assert_eq!(player_config.providers.default_provider, "databricks");
+        assert_eq!(
+            player_config.providers.databricks.as_ref().unwrap().model,
+            "databricks-dbrx-instruct"
+        );
+    }
+
+    #[test]
+    fn test_no_overrides_uses_defaults() {
+        let mut config = Config::default();
+        config.providers.default_provider = "databricks".to_string();
+        
+        let coach_config = config.for_coach().unwrap();
+        let player_config = config.for_player().unwrap();
+        
+        // Both should use the default provider when no overrides
+        assert_eq!(coach_config.providers.default_provider, "databricks");
+        assert_eq!(player_config.providers.default_provider, "databricks");
+    }
+
+    #[test]
+    fn test_provider_override_only() {
+        let mut config = Config::default();
+        
+        config.providers.anthropic = Some(AnthropicConfig {
+            api_key: "test-key".to_string(),
+            model: "claude-3-5-sonnet-20241022".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+        });
+        
+        // Only override provider, not model
+        config.autonomous.coach_provider = Some("anthropic".to_string());
+        
+        let coach_config = config.for_coach().unwrap();
+        
+        // Should use overridden provider with its default model
+        assert_eq!(coach_config.providers.default_provider, "anthropic");
+        assert_eq!(
+            coach_config.providers.anthropic.as_ref().unwrap().model,
+            "claude-3-5-sonnet-20241022"
+        );
+    }
+
+    #[test]
+    fn test_model_override_only() {
+        let mut config = Config::default();
+        config.providers.default_provider = "databricks".to_string();
+        
+        config.providers.databricks = Some(DatabricksConfig {
+            host: "https://test.databricks.com".to_string(),
+            token: Some("test-token".to_string()),
+            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+            use_oauth: Some(false),
+        });
+        
+        // Only override model, not provider
+        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());
+        
+        let player_config = config.for_player().unwrap();
+        
+        // Should use default provider with overridden model
+        assert_eq!(player_config.providers.default_provider, "databricks");
+        assert_eq!(
+            player_config.providers.databricks.as_ref().unwrap().model,
+            "databricks-dbrx-instruct"
+        );
+    }
+}
--- a/crates/g3-config/src/lib.rs
+++ b/crates/g3-config/src/lib.rs
@@ -2,16 +2,23 @@ use serde::{Deserialize, Serialize};
 use anyhow::Result;
 use std::path::Path;

+#[cfg(test)]
+mod autonomous_config_tests;
+
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct Config {
    pub providers: ProvidersConfig,
    pub agent: AgentConfig,
+    pub computer_control: ComputerControlConfig,
+    pub webdriver: WebDriverConfig,
+    pub autonomous: AutonomousConfig,
 }

 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct ProvidersConfig {
    pub openai: Option<OpenAIConfig>,
    pub anthropic: Option<AnthropicConfig>,
+    pub databricks: Option<DatabricksConfig>,
    pub embedded: Option<EmbeddedConfig>,
    pub default_provider: String,
 }
@@ -33,6 +40,16 @@ pub struct AnthropicConfig {
    pub temperature: Option<f32>,
 }

+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct DatabricksConfig {
+    pub host: String,
+    pub token: Option<String>, // Optional - will use OAuth if not provided
+    pub model: String,
+    pub max_tokens: Option<u32>,
+    pub temperature: Option<f32>,
+    pub use_oauth: Option<bool>, // Default to true if token not provided
+}
+
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct EmbeddedConfig {
    pub model_path: String,
@@ -51,20 +68,77 @@ pub struct AgentConfig {
    pub timeout_seconds: u64,
 }

+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ComputerControlConfig {
+    pub enabled: bool,
+    pub require_confirmation: bool,
+    pub max_actions_per_second: u32,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct WebDriverConfig {
+    pub enabled: bool,
+    pub safari_port: u16,
+}
+
+impl Default for WebDriverConfig {
+    fn default() -> Self {
+        Self {
+            enabled: false,
+            safari_port: 4444,
+        }
+    }
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct AutonomousConfig {
+    pub coach_provider: Option<String>,
+    pub coach_model: Option<String>,
+    pub player_provider: Option<String>,
+    pub player_model: Option<String>,
+}
+
+impl Default for AutonomousConfig {
+    fn default() -> Self {
+        Self { coach_provider: None, coach_model: None, player_provider: None, player_model: None }
+    }
+}
+
+impl Default for ComputerControlConfig {
+    fn default() -> Self {
+        Self {
+            enabled: false, // Disabled by default for safety
+            require_confirmation: true,
+            max_actions_per_second: 5,
+        }
+    }
+}
+
 impl Default for Config {
    fn default() -> Self {
        Self {
            providers: ProvidersConfig {
                openai: None,
                anthropic: None,
+                databricks: Some(DatabricksConfig {
+                    host: "https://your-workspace.cloud.databricks.com".to_string(),
+                    token: None, // Will use OAuth by default
+                    model: "databricks-claude-sonnet-4".to_string(),
+                    max_tokens: Some(4096),
+                    temperature: Some(0.1),
+                    use_oauth: Some(true),
+                }),
                embedded: None,
-                default_provider: "anthropic".to_string(),
+                default_provider: "databricks".to_string(),
            },
            agent: AgentConfig {
                max_context_length: 8192,
                enable_streaming: true,
                timeout_seconds: 60,
            },
+            computer_control: ComputerControlConfig::default(),
+            webdriver: WebDriverConfig::default(),
+            autonomous: AutonomousConfig::default(),
        }
    }
 }
@@ -88,9 +162,9 @@ impl Config {
            })
        };
        
-        // If no config exists, create and save a default Qwen config
+        // If no config exists, create and save a default Databricks config
        if !config_exists {
-            let qwen_config = Self::default_qwen_config();
+            let databricks_config = Self::default();
            
            // Save to default location
            let config_dir = dirs::home_dir()
@@ -105,13 +179,13 @@ impl Config {
            std::fs::create_dir_all(&config_dir).ok();
            
            let config_file = config_dir.join("config.toml");
-            if let Err(e) = qwen_config.save(config_file.to_str().unwrap()) {
+            if let Err(e) = databricks_config.save(config_file.to_str().unwrap()) {
                eprintln!("Warning: Could not save default config: {}", e);
            } else {
-                println!("Created default Qwen configuration at: {}", config_file.display());
+                println!("Created default Databricks configuration at: {}", config_file.display());
            }
            
-            return Ok(qwen_config);
+            return Ok(databricks_config);
        }
        
        // Existing config loading logic
@@ -152,11 +226,13 @@ impl Config {
        Ok(config)
    }

+    #[allow(dead_code)]
    fn default_qwen_config() -> Self {
        Self {
            providers: ProvidersConfig {
                openai: None,
                anthropic: None,
+                databricks: None,
                embedded: Some(EmbeddedConfig {
                    model_path: "~/.cache/g3/models/qwen2.5-7b-instruct-q3_k_m.gguf".to_string(),
                    model_type: "qwen".to_string(),
@@ -173,6 +249,9 @@ impl Config {
                enable_streaming: true,
                timeout_seconds: 60,
            },
+            computer_control: ComputerControlConfig::default(),
+            webdriver: WebDriverConfig::default(),
+            autonomous: AutonomousConfig::default(),
        }
    }
    
@@ -181,4 +260,138 @@ impl Config {
        std::fs::write(path, toml_string)?;
        Ok(())
    }
+    
+    pub fn load_with_overrides(
+        config_path: Option<&str>,
+        provider_override: Option<String>,
+        model_override: Option<String>,
+    ) -> Result<Self> {
+        // Load the base configuration
+        let mut config = Self::load(config_path)?;
+        
+        // Apply provider override
+        if let Some(provider) = provider_override {
+            config.providers.default_provider = provider;
+        }
+        
+        // Apply model override to the active provider
+        if let Some(model) = model_override {
+            match config.providers.default_provider.as_str() {
+                "anthropic" => {
+                    if let Some(ref mut anthropic) = config.providers.anthropic {
+                        anthropic.model = model;
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
+                        ));
+                    }
+                }
+                "databricks" => {
+                    if let Some(ref mut databricks) = config.providers.databricks {
+                        databricks.model = model;
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Provider 'databricks' is not configured. Please add databricks configuration to your config file."
+                        ));
+                    }
+                }
+                "embedded" => {
+                    if let Some(ref mut embedded) = config.providers.embedded {
+                        embedded.model_path = model;
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Provider 'embedded' is not configured. Please add embedded configuration to your config file."
+                        ));
+                    }
+                }
+                "openai" => {
+                    if let Some(ref mut openai) = config.providers.openai {
+                        openai.model = model;
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Provider 'openai' is not configured. Please add openai configuration to your config file."
+                        ));
+                    }
+                }
+                _ => return Err(anyhow::anyhow!("Unknown provider: {}", 
+                    config.providers.default_provider)),
+            }
+        }
+        
+        Ok(config)
+    }
+    
+    /// Create a config for the coach agent in autonomous mode
+    pub fn for_coach(&self) -> Result<Self> {
+        let mut config = self.clone();
+        
+        // Apply coach-specific overrides if configured
+        if let Some(ref coach_provider) = self.autonomous.coach_provider {
+            config.providers.default_provider = coach_provider.clone();
+        }
+        
+        if let Some(ref coach_model) = self.autonomous.coach_model {
+            // Apply model override to the coach's provider
+            match config.providers.default_provider.as_str() {
+                "anthropic" => {
+                    if let Some(ref mut anthropic) = config.providers.anthropic {
+                        anthropic.model = coach_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Coach provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
+                        ));
+                    }
+                }
+                "databricks" => {
+                    if let Some(ref mut databricks) = config.providers.databricks {
+                        databricks.model = coach_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Coach provider 'databricks' is not configured. Please add databricks configuration to your config file."
+                        ));
+                    }
+                }
+                _ => {}
+            }
+        }
+        
+        Ok(config)
+    }
+    
+    /// Create a config for the player agent in autonomous mode
+    pub fn for_player(&self) -> Result<Self> {
+        let mut config = self.clone();
+        
+        // Apply player-specific overrides if configured
+        if let Some(ref player_provider) = self.autonomous.player_provider {
+            config.providers.default_provider = player_provider.clone();
+        }
+        
+        if let Some(ref player_model) = self.autonomous.player_model {
+            // Apply model override to the player's provider
+            match config.providers.default_provider.as_str() {
+                "anthropic" => {
+                    if let Some(ref mut anthropic) = config.providers.anthropic {
+                        anthropic.model = player_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Player provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
+                        ));
+                    }
+                }
+                "databricks" => {
+                    if let Some(ref mut databricks) = config.providers.databricks {
+                        databricks.model = player_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Player provider 'databricks' is not configured. Please add databricks configuration to your config file."
+                        ));
+                    }
+                }
+                _ => {}
+            }
+        }
+        
+        Ok(config)
+    }
 }
--- a/crates/g3-core/Cargo.toml
+++ b/crates/g3-core/Cargo.toml
@@ -8,6 +8,7 @@ description = "Core engine for G3 AI coding agent"
 g3-providers = { path = "../g3-providers" }
 g3-config = { path = "../g3-config" }
 g3-execution = { path = "../g3-execution" }
+g3-computer-control = { path = "../g3-computer-control" }
 tokio = { workspace = true }
 reqwest = { workspace = true }
 anyhow = { workspace = true }
@@ -18,7 +19,9 @@ serde_json = { workspace = true }
 uuid = { workspace = true }
 async-trait = "0.1"
 tokio-stream = "0.1"
-llama_cpp = { version = "0.3.2", features = ["metal"] }
-shellexpand = "3.1"
 tokio-util = "0.7"
 futures-util = "0.3"
+chrono = { version = "0.4", features = ["serde"] }
+rand = "0.8"
+regex = "1.0"
+shellexpand = "3.1"
--- a/crates/g3-core/src/error_handling.rs
+++ b/crates/g3-core/src/error_handling.rs
@@ -0,0 +1,501 @@
+//! Error handling module for G3 with retry logic and detailed logging
+//!
+//! This module provides:
+//! - Classification of errors as recoverable or non-recoverable
+//! - Retry logic with exponential backoff and jitter for recoverable errors
+//! - Detailed error logging with context information
+//! - Request/response capture for debugging
+
+use anyhow::Result;
+use serde::{Deserialize, Serialize};
+use std::time::Duration;
+use tracing::{error, info, warn};
+
+/// Maximum number of retry attempts for recoverable errors (default mode)
+const DEFAULT_MAX_RETRY_ATTEMPTS: u32 = 3;
+
+/// Maximum number of retry attempts for autonomous mode
+const AUTONOMOUS_MAX_RETRY_ATTEMPTS: u32 = 6;
+
+/// Base delay for exponential backoff (in milliseconds)
+const BASE_RETRY_DELAY_MS: u64 = 1000;
+
+/// Maximum delay between retries (in milliseconds) for default mode
+const DEFAULT_MAX_RETRY_DELAY_MS: u64 = 10000;
+
+/// Maximum delay between retries (in milliseconds) for autonomous mode
+/// Spread over 10 minutes (600 seconds) with 6 retries
+const AUTONOMOUS_MAX_RETRY_DELAY_MS: u64 = 120000; // 2 minutes max per retry
+
+// Removed unused constants AUTONOMOUS_RETRY_BUDGET_MS and DEFAULT_JITTER_FACTOR
+
+/// Jitter factor for autonomous mode (higher for better distribution)
+const JITTER_FACTOR: f64 = 0.3;
+
+/// Error context information for detailed logging
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ErrorContext {
+    /// The operation that was being performed
+    pub operation: String,
+    /// The provider being used
+    pub provider: String,
+    /// The model being used
+    pub model: String,
+    /// The last prompt sent (truncated for logging)
+    pub last_prompt: String,
+    /// Raw request data (if available)
+    pub raw_request: Option<String>,
+    /// Raw response data (if available)
+    pub raw_response: Option<String>,
+    /// Stack trace
+    pub stack_trace: String,
+    /// Timestamp
+    pub timestamp: u64,
+    /// Number of tokens in context
+    pub context_tokens: u32,
+    /// Session ID if available
+    pub session_id: Option<String>,
+    /// Whether to skip file logging (quiet mode)
+    pub quiet: bool,
+}
+
+impl ErrorContext {
+    pub fn new(
+        operation: String,
+        provider: String,
+        model: String,
+        last_prompt: String,
+        session_id: Option<String>,
+        context_tokens: u32,
+        quiet: bool,
+    ) -> Self {
+        let timestamp = std::time::SystemTime::now()
+            .duration_since(std::time::UNIX_EPOCH)
+            .unwrap_or_default()
+            .as_secs();
+
+        // Capture stack trace
+        let stack_trace = std::backtrace::Backtrace::force_capture().to_string();
+
+        Self {
+            operation,
+            provider,
+            model,
+            last_prompt: truncate_for_logging(&last_prompt, 1000),
+            raw_request: None,
+            raw_response: None,
+            stack_trace,
+            timestamp,
+            context_tokens,
+            session_id,
+            quiet,
+        }
+    }
+
+    pub fn with_request(mut self, request: String) -> Self {
+        self.raw_request = Some(truncate_for_logging(&request, 5000));
+        self
+    }
+
+    pub fn with_response(mut self, response: String) -> Self {
+        self.raw_response = Some(truncate_for_logging(&response, 5000));
+        self
+    }
+
+    /// Log the error context with ERROR level
+    pub fn log_error(&self, error: &anyhow::Error) {
+        error!("=== G3 ERROR DETAILS ===");
+        error!("Operation: {}", self.operation);
+        error!("Provider: {} | Model: {}", self.provider, self.model);
+        error!("Error: {}", error);
+        error!("Timestamp: {}", self.timestamp);
+        error!("Session ID: {:?}", self.session_id);
+        error!("Context Tokens: {}", self.context_tokens);
+        error!("Last Prompt: {}", self.last_prompt);
+        
+        if let Some(ref req) = self.raw_request {
+            error!("Raw Request: {}", req);
+        }
+        
+        if let Some(ref resp) = self.raw_response {
+            error!("Raw Response: {}", resp);
+        }
+        
+        error!("Stack Trace:\n{}", self.stack_trace);
+        error!("=== END ERROR DETAILS ===");
+
+        // Also save to error log file
+        self.save_to_file();
+    }
+
+    /// Save error context to a file for later analysis
+    fn save_to_file(&self) {
+        // Skip file logging if quiet mode is enabled
+        if self.quiet {
+            return;
+        }
+
+        let logs_dir = std::path::Path::new("logs/errors");
+        if !logs_dir.exists() {
+            if let Err(e) = std::fs::create_dir_all(logs_dir) {
+                error!("Failed to create error logs directory: {}", e);
+                return;
+            }
+        }
+
+        let filename = format!(
+            "logs/errors/error_{}_{}.json",
+            self.timestamp,
+            self.session_id.as_deref().unwrap_or("unknown")
+        );
+
+        match serde_json::to_string_pretty(self) {
+            Ok(json_content) => {
+                if let Err(e) = std::fs::write(&filename, json_content) {
+                    error!("Failed to save error context to {}: {}", filename, e);
+                } else {
+                    info!("Error details saved to: {}", filename);
+                }
+            }
+            Err(e) => {
+                error!("Failed to serialize error context: {}", e);
+            }
+        }
+    }
+}
+
+/// Classification of error types
+#[derive(Debug, Clone, PartialEq)]
+pub enum ErrorType {
+    /// Recoverable errors that should be retried
+    Recoverable(RecoverableError),
+    /// Non-recoverable errors that should terminate execution
+    NonRecoverable,
+}
+
+/// Types of recoverable errors
+#[derive(Debug, Clone, PartialEq)]
+pub enum RecoverableError {
+    /// Rate limit exceeded
+    RateLimit,
+    /// Temporary network error
+    NetworkError,
+    /// Server error (5xx)
+    ServerError,
+    /// Model is busy/overloaded
+    ModelBusy,
+    /// Timeout
+    Timeout,
+    /// Token limit exceeded (might be recoverable with summarization)
+    TokenLimit,
+}
+
+/// Classify an error as recoverable or non-recoverable
+pub fn classify_error(error: &anyhow::Error) -> ErrorType {
+    let error_str = error.to_string().to_lowercase();
+
+    // Check for recoverable error patterns
+    if error_str.contains("rate limit") || error_str.contains("rate_limit") || error_str.contains("429") {
+        return ErrorType::Recoverable(RecoverableError::RateLimit);
+    }
+
+    if error_str.contains("network") || error_str.contains("connection") || 
+       error_str.contains("dns") || error_str.contains("refused") {
+        return ErrorType::Recoverable(RecoverableError::NetworkError);
+    }
+
+    if error_str.contains("500") || error_str.contains("502") || 
+       error_str.contains("503") || error_str.contains("504") ||
+       error_str.contains("server error") || error_str.contains("internal error") {
+        return ErrorType::Recoverable(RecoverableError::ServerError);
+    }
+
+    if error_str.contains("busy") || error_str.contains("overloaded") || 
+       error_str.contains("capacity") || error_str.contains("unavailable") {
+        return ErrorType::Recoverable(RecoverableError::ModelBusy);
+    }
+
+    // Enhanced timeout detection - check for various timeout patterns
+    if error_str.contains("timeout") || 
+       error_str.contains("timed out") || 
+       error_str.contains("operation timed out") ||
+       error_str.contains("request or response body error") ||  // Common timeout pattern
+       error_str.contains("stream error") && error_str.contains("timed out") {
+        return ErrorType::Recoverable(RecoverableError::Timeout);
+    }
+
+    if error_str.contains("token") && (error_str.contains("limit") || error_str.contains("exceeded")) {
+        return ErrorType::Recoverable(RecoverableError::TokenLimit);
+    }
+
+    // Default to non-recoverable for unknown errors
+    ErrorType::NonRecoverable
+}
+
+/// Calculate retry delay for autonomous mode with better distribution over 10 minutes
+fn calculate_autonomous_retry_delay(attempt: u32) -> Duration {
+    use rand::Rng;
+    let mut rng = rand::thread_rng();
+    
+    // Distribute 6 retries over 10 minutes (600 seconds)
+    // Base delays: 10s, 30s, 60s, 120s, 180s, 200s = 600s total
+    let base_delays_ms = [10000, 30000, 60000, 120000, 180000, 200000];
+    let base_delay = base_delays_ms.get(attempt.saturating_sub(1) as usize).unwrap_or(&200000);
+    
+    // Add jitter of ±30% to prevent thundering herd
+    let jitter = (*base_delay as f64 * 0.3 * rng.gen::<f64>()) as u64;
+    let final_delay = if rng.gen_bool(0.5) {
+        base_delay + jitter
+    } else {
+        base_delay.saturating_sub(jitter)
+    };
+    
+    Duration::from_millis(final_delay)
+}
+
+/// Calculate retry delay with exponential backoff and jitter
+pub fn calculate_retry_delay(attempt: u32, is_autonomous: bool) -> Duration {
+    if is_autonomous {
+        return calculate_autonomous_retry_delay(attempt);
+    }
+    
+    use rand::Rng;
+    let max_retry_delay_ms = if is_autonomous { AUTONOMOUS_MAX_RETRY_DELAY_MS } else { DEFAULT_MAX_RETRY_DELAY_MS };
+    
+    // Exponential backoff: delay = base * 2^attempt
+    let base_delay = BASE_RETRY_DELAY_MS * (2_u64.pow(attempt.saturating_sub(1)));
+    let capped_delay = base_delay.min(max_retry_delay_ms);
+    
+    // Add jitter to prevent thundering herd
+    let mut rng = rand::thread_rng();
+    let jitter = (capped_delay as f64 * JITTER_FACTOR * rng.gen::<f64>()) as u64;
+    let final_delay = if rng.gen_bool(0.5) {
+        capped_delay + jitter
+    } else {
+        capped_delay.saturating_sub(jitter)
+    };
+    
+    Duration::from_millis(final_delay)
+}
+
+/// Retry logic for async operations
+pub async fn retry_with_backoff<F, Fut, T>(
+    operation_name: &str,
+    mut operation: F,
+    context: &ErrorContext,
+    is_autonomous: bool,
+) -> Result<T>
+where
+    F: FnMut() -> Fut,
+    Fut: std::future::Future<Output = Result<T>>,
+{
+    let mut attempt = 0;
+    let mut _last_error = None;
+
+    loop {
+        attempt += 1;
+        
+        match operation().await {
+            Ok(result) => {
+                if attempt > 1 {
+                    info!(
+                        "Operation '{}' succeeded after {} attempts",
+                        operation_name, attempt
+                    );
+                }
+                return Ok(result);
+            }
+            Err(error) => {
+                let error_type = classify_error(&error);
+                let max_attempts = if is_autonomous { AUTONOMOUS_MAX_RETRY_ATTEMPTS } else { DEFAULT_MAX_RETRY_ATTEMPTS };
+                
+                match error_type {
+                    ErrorType::Recoverable(recoverable_type) => {
+                        if attempt >= max_attempts {
+                            error!(
+                                "Operation '{}' failed after {} attempts. Giving up.",
+                                operation_name, attempt
+                            );
+                            context.clone().log_error(&error);
+                            return Err(error);
+                        }
+                        
+                        let delay = calculate_retry_delay(attempt, is_autonomous);
+                        warn!(
+                            "Recoverable error ({:?}) in '{}' (attempt {}/{}). Retrying in {:?}...",
+                            recoverable_type, operation_name, attempt, max_attempts, delay
+                        );
+                        warn!("Error details: {}", error);
+                        
+                        // Special handling for token limit errors
+                        if matches!(recoverable_type, RecoverableError::TokenLimit) {
+                            info!("Token limit error detected. Consider triggering summarization.");
+                        }
+                        
+                        tokio::time::sleep(delay).await;
+                        _last_error = Some(error);
+                    }
+                    ErrorType::NonRecoverable => {
+                        error!(
+                            "Non-recoverable error in '{}' (attempt {}). Terminating.",
+                            operation_name, attempt
+                        );
+                        context.clone().log_error(&error);
+                        return Err(error);
+                    }
+                }
+            }
+        }
+    }
+}
+
+/// Helper function to truncate strings for logging
+fn truncate_for_logging(s: &str, max_len: usize) -> String {
+    if s.len() <= max_len {
+        s.to_string()
+    } else {
+        // Find a safe UTF-8 boundary to truncate at
+        // We need to ensure we don't cut in the middle of a multi-byte character
+        let mut truncate_at = max_len;
+        
+        // Walk backwards from max_len to find a character boundary
+        while truncate_at > 0 && !s.is_char_boundary(truncate_at) {
+            truncate_at -= 1;
+        }
+        
+        // If we couldn't find a boundary (shouldn't happen), use a safe default
+        if truncate_at == 0 {
+            truncate_at = max_len.min(s.len());
+        }
+        
+        format!("{}... (truncated, {} total bytes)", &s[..truncate_at], s.len())
+    }
+}
+
+/// Macro for creating error context easily
+#[macro_export]
+macro_rules! error_context {
+    ($operation:expr, $provider:expr, $model:expr, $prompt:expr, $session_id:expr, $tokens:expr) => {
+        $crate::error_handling::ErrorContext::new(
+            $operation.to_string(),
+            $provider.to_string(),
+            $model.to_string(),
+            $prompt.to_string(),
+            $session_id,
+            $tokens,
+        )
+    };
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use anyhow::anyhow;
+
+    #[test]
+    fn test_error_classification() {
+        // Rate limit errors
+        let error = anyhow!("Rate limit exceeded");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::RateLimit));
+        
+        let error = anyhow!("HTTP 429 Too Many Requests");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::RateLimit));
+        
+        // Network errors
+        let error = anyhow!("Network connection failed");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::NetworkError));
+        
+        // Server errors
+        let error = anyhow!("HTTP 503 Service Unavailable");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::ServerError));
+        
+        // Model busy
+        let error = anyhow!("Model is busy, please try again");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::ModelBusy));
+        
+        // Timeout
+        let error = anyhow!("Request timed out");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::Timeout));
+        
+        // Token limit
+        let error = anyhow!("Token limit exceeded");
+        assert_eq!(classify_error(&error), ErrorType::Recoverable(RecoverableError::TokenLimit));
+        
+        // Non-recoverable
+        let error = anyhow!("Invalid API key");
+        assert_eq!(classify_error(&error), ErrorType::NonRecoverable);
+        
+        let error = anyhow!("Malformed request");
+        assert_eq!(classify_error(&error), ErrorType::NonRecoverable);
+    }
+
+    #[test]
+    fn test_retry_delay_calculation() {
+        // Test that delays increase exponentially
+        let delay1 = calculate_retry_delay(1, false);
+        let delay2 = calculate_retry_delay(2, false);
+        let delay3 = calculate_retry_delay(3, false);
+        
+        // Due to jitter, we can't test exact values, but the base should increase
+        assert!(delay1.as_millis() >= (BASE_RETRY_DELAY_MS as f64 * 0.7) as u128);
+        assert!(delay1.as_millis() <= (BASE_RETRY_DELAY_MS as f64 * 1.3) as u128);
+        
+        // Delay 2 should be roughly 2x delay 1 (minus jitter)
+        assert!(delay2.as_millis() >= delay1.as_millis());
+        
+        // Delay 3 should be roughly 2x delay 2 (minus jitter)
+        assert!(delay3.as_millis() >= delay2.as_millis());
+        
+        // Test max cap
+        let delay_max = calculate_retry_delay(10, false);
+        assert!(delay_max.as_millis() <= (DEFAULT_MAX_RETRY_DELAY_MS as f64 * 1.3) as u128);
+    }
+
+    #[test]
+    fn test_autonomous_retry_delay_calculation() {
+        // Test autonomous mode delays are distributed over 10 minutes
+        let delay1 = calculate_retry_delay(1, true);
+        let delay2 = calculate_retry_delay(2, true);
+        let delay3 = calculate_retry_delay(3, true);
+        let delay4 = calculate_retry_delay(4, true);
+        let delay5 = calculate_retry_delay(5, true);
+        let delay6 = calculate_retry_delay(6, true);
+        
+        // Base delays should be around: 10s, 30s, 60s, 120s, 180s, 200s
+        // With ±30% jitter
+        assert!(delay1.as_millis() >= 7000 && delay1.as_millis() <= 13000);
+        assert!(delay2.as_millis() >= 21000 && delay2.as_millis() <= 39000);
+        assert!(delay3.as_millis() >= 42000 && delay3.as_millis() <= 78000);
+        assert!(delay4.as_millis() >= 84000 && delay4.as_millis() <= 156000);
+        assert!(delay5.as_millis() >= 126000 && delay5.as_millis() <= 234000);
+        assert!(delay6.as_millis() >= 140000 && delay6.as_millis() <= 260000);
+    }
+
+    #[test]
+    fn test_truncate_for_logging() {
+        let short_text = "Hello, world!";
+        assert_eq!(truncate_for_logging(short_text, 20), "Hello, world!");
+        
+        let long_text = "This is a very long text that should be truncated for logging purposes";
+        let truncated = truncate_for_logging(long_text, 20);
+        assert!(truncated.starts_with("This is a very long "));
+        assert!(truncated.contains("truncated"));
+        assert!(truncated.contains("total bytes"));
+    }
+    
+    #[test]
+    fn test_truncate_with_multibyte_chars() {
+        // Test with multi-byte UTF-8 characters
+        let text_with_emoji = "Hello 👋 World 🌍 Test ✨ More text here";
+        let truncated = truncate_for_logging(text_with_emoji, 10);
+        // Should truncate at a valid UTF-8 boundary
+        assert!(truncated.starts_with("Hello "));
+        
+        // Test with box-drawing characters like the one causing the panic
+        let text_with_box = "Some text ┌─────┐ more text";
+        let truncated = truncate_for_logging(text_with_box, 12);
+        // Should not panic and should truncate at a valid boundary
+        assert!(truncated.contains("Some text"));
+        assert!(truncated.contains("truncated"));
+    }
+}
--- a/crates/g3-core/src/error_handling_test.rs
+++ b/crates/g3-core/src/error_handling_test.rs
@@ -0,0 +1,154 @@
+//! Integration tests for error handling with retry logic
+
+#[cfg(test)]
+mod tests {
+    use crate::error_handling::*;
+    use std::sync::atomic::{AtomicU32, Ordering};
+    use std::sync::Arc;
+
+    #[tokio::test]
+    async fn test_retry_with_recoverable_error() {
+        let attempt_count = Arc::new(AtomicU32::new(0));
+
+        let context = ErrorContext::new(
+            "test_operation".to_string(),
+            "test_provider".to_string(),
+            "test_model".to_string(),
+            "test prompt".to_string(),
+            None,
+            100,
+            false,  // quiet parameter
+        );
+
+        let result = retry_with_backoff(
+            "test_operation",
+            || {
+                let counter = Arc::clone(&attempt_count);
+                async move {
+                    let count = counter.fetch_add(1, Ordering::SeqCst);
+                    if count < 2 {
+                        // Fail with recoverable error on first two attempts
+                        Err(anyhow::anyhow!("Rate limit exceeded"))
+                    } else {
+                        // Succeed on third attempt
+                        Ok("Success")
+                    }
+                }
+            },
+            &context,
+            false, // not autonomous mode
+        )
+        .await;
+
+        assert!(result.is_ok());
+        assert_eq!(result.unwrap(), "Success");
+        assert_eq!(attempt_count.load(Ordering::SeqCst), 3);
+    }
+
+    #[tokio::test]
+    async fn test_retry_with_non_recoverable_error() {
+        let attempt_count = Arc::new(AtomicU32::new(0));
+
+        let context = ErrorContext::new(
+            "test_operation".to_string(),
+            "test_provider".to_string(),
+            "test_model".to_string(),
+            "test prompt".to_string(),
+            None,
+            100,
+            false,  // quiet parameter
+        );
+
+        let result: Result<&str, _> = retry_with_backoff(
+            "test_operation",
+            || {
+                let counter = Arc::clone(&attempt_count);
+                async move {
+                    counter.fetch_add(1, Ordering::SeqCst);
+                    // Always fail with non-recoverable error
+                    Err(anyhow::anyhow!("Invalid API key"))
+                }
+            },
+            &context,
+            false, // not autonomous mode
+        )
+        .await;
+
+        assert!(result.is_err());
+        assert_eq!(attempt_count.load(Ordering::SeqCst), 1); // Should only try once
+    }
+
+    #[tokio::test]
+    async fn test_retry_exhaustion() {
+        let attempt_count = Arc::new(AtomicU32::new(0));
+
+        let context = ErrorContext::new(
+            "test_operation".to_string(),
+            "test_provider".to_string(),
+            "test_model".to_string(),
+            "test prompt".to_string(),
+            None,
+            100,
+            false,  // quiet parameter
+        );
+
+        let result: Result<&str, _> = retry_with_backoff(
+            "test_operation",
+            || {
+                let counter = Arc::clone(&attempt_count);
+                async move {
+                    counter.fetch_add(1, Ordering::SeqCst);
+                    // Always fail with recoverable error
+                    Err(anyhow::anyhow!("Network connection failed"))
+                }
+            },
+            &context,
+            false, // not autonomous mode
+        )
+        .await;
+
+        assert!(result.is_err());
+        assert_eq!(attempt_count.load(Ordering::SeqCst), 3); // Should try MAX_RETRY_ATTEMPTS times
+    }
+
+    #[test]
+    fn test_error_context_truncation() {
+        let long_prompt = "a".repeat(2000);
+        let context = ErrorContext::new(
+            "test_op".to_string(),
+            "provider".to_string(),
+            "model".to_string(),
+            long_prompt,
+            None,
+            100,
+            false,  // quiet parameter
+        );
+
+        // The prompt should be truncated to 1000 chars
+        assert!(context.last_prompt.len() < 1100); // Some buffer for the truncation message
+        assert!(context.last_prompt.contains("truncated"));
+    }
+
+    #[test]
+    fn test_retry_delay_increases() {
+        let delay1 = calculate_retry_delay(1, false);
+        let delay2 = calculate_retry_delay(2, false);
+        let delay3 = calculate_retry_delay(3, false);
+
+        // Delays should generally increase (though jitter can affect this)
+        // We'll test the base delays without jitter
+        let base1 = 1000u64; // BASE_RETRY_DELAY_MS
+        let base2 = 1000u64 * 2;
+        let base3 = 1000u64 * 4;
+
+        // Check that delays are within expected ranges (accounting for jitter)
+        assert!(delay1.as_millis() >= (base1 as f64 * 0.7) as u128);
+        assert!(delay1.as_millis() <= (base1 as f64 * 1.3) as u128);
+
+        assert!(delay2.as_millis() >= (base2 as f64 * 0.7) as u128);
+        assert!(delay2.as_millis() <= (base2 as f64 * 1.3) as u128);
+
+        assert!(delay3.as_millis() >= (base3 as f64 * 0.7) as u128);
+        assert!(delay3.as_millis() <= (base3 as f64 * 1.3) as u128);
+    }
+}
--- a/crates/g3-core/src/fixed_filter_json.rs
+++ b/crates/g3-core/src/fixed_filter_json.rs
@@ -0,0 +1,222 @@
+// FINAL CORRECTED implementation of filter_json_tool_calls function according to specification
+// 1. Detect tool call start with regex '\w*{\w*"tool"\w*:\w*"' on the very next newline
+// 2. Enter suppression mode and use brace counting to find complete JSON
+// 3. Only elide JSON content between first '{' and last '}' (inclusive)
+// 4. Return everything else as the final filtered string
+
+use regex::Regex;
+use std::cell::RefCell;
+use tracing::debug;
+
+// Thread-local state for tracking JSON tool call suppression
+thread_local! {
+    static FIXED_JSON_TOOL_STATE: RefCell<FixedJsonToolState> = RefCell::new(FixedJsonToolState::new());
+}
+
+#[derive(Debug, Clone)]
+struct FixedJsonToolState {
+    suppression_mode: bool,
+    brace_depth: i32,
+    buffer: String,
+    json_start_in_buffer: Option<usize>,
+    content_returned_up_to: usize, // Track how much content we've already returned
+}
+
+impl FixedJsonToolState {
+    fn new() -> Self {
+        Self {
+            suppression_mode: false,
+            brace_depth: 0,
+            buffer: String::new(),
+            json_start_in_buffer: None,
+            content_returned_up_to: 0,
+        }
+    }
+
+    fn reset(&mut self) {
+        self.suppression_mode = false;
+        self.brace_depth = 0;
+        self.buffer.clear();
+        self.json_start_in_buffer = None;
+        self.content_returned_up_to = 0;
+    }
+}
+
+// FINAL CORRECTED implementation according to specification
+
+pub fn fixed_filter_json_tool_calls(content: &str) -> String {
+    if content.is_empty() {
+        return String::new();
+    }
+
+    FIXED_JSON_TOOL_STATE.with(|state| {
+        let mut state = state.borrow_mut();
+
+        // Add new content to buffer
+        state.buffer.push_str(content);
+
+        // If we're already in suppression mode, continue brace counting
+        if state.suppression_mode {
+            // Count braces in the new content only
+            for ch in content.chars() {
+                match ch {
+                    '{' => state.brace_depth += 1,
+                    '}' => {
+                        state.brace_depth -= 1;
+                        // Exit suppression mode when all braces are closed
+                        if state.brace_depth <= 0 {
+                            debug!("JSON tool call completed - exiting suppression mode");
+
+                            // Extract the complete result with JSON filtered out
+                            let result = extract_fixed_content(
+                                &state.buffer,
+                                state.json_start_in_buffer.unwrap_or(0),
+                            );
+
+                            // Return only the part we haven't returned yet
+                            let new_content = if result.len() > state.content_returned_up_to {
+                                result[state.content_returned_up_to..].to_string()
+                            } else {
+                                String::new()
+                            };
+
+                            state.reset();
+                            return new_content;
+                        }
+                    }
+                    _ => {}
+                }
+            }
+            // Still in suppression mode, return empty string (content is being accumulated)
+            return String::new();
+        }
+
+        // Check for tool call pattern using corrected regex
+        // More flexible than the strict specification to handle real-world JSON
+        let tool_call_regex = Regex::new(r#"(?m)^\s*\{\s*"tool"\s*:\s*""#).unwrap();
+
+        if let Some(captures) = tool_call_regex.find(&state.buffer) {
+            let match_text = captures.as_str();
+
+            // Find the position of the opening brace in the match
+            if let Some(brace_offset) = match_text.find('{') {
+                let json_start = captures.start() + brace_offset;
+
+                debug!(
+                    "Detected JSON tool call at position {} - entering suppression mode",
+                    json_start
+                );
+
+                // Return content before JSON that we haven't returned yet
+                let content_before_json = if json_start >= state.content_returned_up_to {
+                    state.buffer[state.content_returned_up_to..json_start].to_string()
+                } else {
+                    String::new()
+                };
+
+                state.content_returned_up_to = json_start;
+
+                // Enter suppression mode
+                state.suppression_mode = true;
+                state.brace_depth = 0;
+                state.json_start_in_buffer = Some(json_start);
+
+                // Count braces from the JSON start to see if it's complete
+                let buffer_clone = state.buffer.clone();
+                for ch in buffer_clone[json_start..].chars() {
+                    match ch {
+                        '{' => state.brace_depth += 1,
+                        '}' => {
+                            state.brace_depth -= 1;
+                            if state.brace_depth <= 0 {
+                                // JSON is complete in this chunk
+                                debug!("JSON tool call completed in same chunk");
+                                let result = extract_fixed_content(&buffer_clone, json_start);
+
+                                // Return content before JSON plus content after JSON
+                                let content_after_json = if result.len() > json_start {
+                                    &result[json_start..]
+                                } else {
+                                    ""
+                                };
+
+                                let final_result =
+                                    format!("{}{}", content_before_json, content_after_json);
+                                state.reset();
+                                return final_result;
+                            }
+                        }
+                        _ => {}
+                    }
+                }
+
+                // JSON is incomplete, return only the content before JSON
+                return content_before_json;
+            }
+        }
+
+        // No JSON tool call detected, return only the new content we haven't returned yet
+        let new_content = if state.buffer.len() > state.content_returned_up_to {
+            let result = state.buffer[state.content_returned_up_to..].to_string();
+            state.content_returned_up_to = state.buffer.len();
+            result
+        } else {
+            String::new()
+        };
+
+        new_content
+    })
+}
+
+// Helper function to extract content with JSON tool call filtered out
+// Returns everything except the JSON between the first '{' and last '}' (inclusive)
+
+fn extract_fixed_content(full_content: &str, json_start: usize) -> String {
+    // Find the end of the JSON using proper brace counting with string handling
+    let mut brace_depth = 0;
+    let mut json_end = json_start;
+    let mut in_string = false;
+    let mut escape_next = false;
+
+    for (i, ch) in full_content[json_start..].char_indices() {
+        if escape_next {
+            escape_next = false;
+            continue;
+        }
+
+        match ch {
+            '\\' if in_string => escape_next = true,
+            '"' if !escape_next => in_string = !in_string,
+            '{' if !in_string => {
+                brace_depth += 1;
+            }
+            '}' if !in_string => {
+                brace_depth -= 1;
+                if brace_depth == 0 {
+                    json_end = json_start + i + 1; // +1 to include the closing brace
+                    break;
+                }
+            }
+            _ => {}
+        }
+    }
+
+    // Return content before and after the JSON (excluding the JSON itself)
+    let before = &full_content[..json_start];
+    let after = if json_end < full_content.len() {
+        &full_content[json_end..]
+    } else {
+        ""
+    };
+
+    format!("{}{}", before, after)
+}
+
+// Reset function for testing
+
+pub fn reset_fixed_json_tool_state() {
+    FIXED_JSON_TOOL_STATE.with(|state| {
+        let mut state = state.borrow_mut();
+        state.reset();
+    });
+}
--- a/crates/g3-core/src/fixed_filter_tests.rs
+++ b/crates/g3-core/src/fixed_filter_tests.rs
@@ -0,0 +1,332 @@
+#[cfg(test)]
+mod fixed_filter_tests {
+    use crate::fixed_filter_json::{fixed_filter_json_tool_calls, reset_fixed_json_tool_state};
+    use regex::Regex;
+
+    #[test]
+    fn test_no_tool_call_passthrough() {
+        reset_fixed_json_tool_state();
+        let input = "This is regular text without any tool calls.";
+        let result = fixed_filter_json_tool_calls(input);
+        assert_eq!(result, input);
+    }
+
+    #[test]
+    fn test_simple_tool_call_detection() {
+        reset_fixed_json_tool_state();
+        let input = r#"Some text before
+{"tool": "shell", "args": {"command": "ls"}}
+Some text after"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Some text before\n\nSome text after";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_streaming_chunks() {
+        reset_fixed_json_tool_state();
+
+        // Simulate streaming where the tool call comes in multiple chunks
+        let chunks = vec![
+            "Some text before\n",
+            "{\"tool\": \"",
+            "shell\", \"args\": {",
+            "\"command\": \"ls\"",
+            "}}\nText after",
+        ];
+
+        let mut results = Vec::new();
+        for chunk in chunks {
+            let result = fixed_filter_json_tool_calls(chunk);
+            results.push(result);
+        }
+
+        // The final accumulated result should have the JSON filtered out
+        let final_result: String = results.join("");
+        let expected = "Some text before\n\nText after";
+        assert_eq!(final_result, expected);
+    }
+
+    #[test]
+    fn test_nested_braces_in_tool_call() {
+        reset_fixed_json_tool_state();
+
+        let input = r#"Text before
+{"tool": "write_file", "args": {"file_path": "test.json", "content": "{\"nested\": \"value\"}"}}
+Text after"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Text before\n\nText after";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_regex_pattern_specification() {
+        // Test the corrected regex pattern that's more flexible with whitespace
+        let pattern = Regex::new(r#"(?m)^\s*\{\s*"tool"\s*:"#).unwrap();
+
+        let test_cases = vec![
+            (
+                r#"line
+{"tool":"#,
+                true,
+            ),
+            (
+                r#"line
+{"tool" :"#,
+                true,
+            ),
+            (
+                r#"line
+{ "tool":"#,
+                true,
+            ), // Space after { DOES match with \s*
+            (
+                r#"line
+abc{"tool":"#,
+                true,
+            ),
+            (
+                r#"line
+{"tool123":"#,
+                false,
+            ), // "tool123" is not exactly "tool"
+            (
+                r#"line
+{"tool" : "#,
+                true,
+            ),
+        ];
+
+        for (input, should_match) in test_cases {
+            let matches = pattern.is_match(input);
+            assert_eq!(
+                matches, should_match,
+                "Pattern matching failed for: {}",
+                input
+            );
+        }
+    }
+
+    #[test]
+    fn test_newline_requirement() {
+        reset_fixed_json_tool_state();
+
+        // According to spec, tool call should be detected "on the very next newline"
+        // Our current regex matches any line that contains the pattern, not just after newlines
+        let input_with_newline = "Text\n{\"tool\": \"shell\", \"args\": {\"command\": \"ls\"}}";
+        let input_without_newline = "Text {\"tool\": \"shell\", \"args\": {\"command\": \"ls\"}}";
+
+        let result1 = fixed_filter_json_tool_calls(input_with_newline);
+        reset_fixed_json_tool_state();
+        let result2 = fixed_filter_json_tool_calls(input_without_newline);
+
+        // Both cases currently trigger suppression due to regex pattern
+        // TODO: Fix regex to only match after actual newlines
+        assert_eq!(result1, "Text\n");
+        // This currently fails because our regex matches both cases
+        assert_eq!(result2, "Text ");
+    }
+
+    #[test]
+    fn test_json_with_escaped_quotes() {
+        reset_fixed_json_tool_state();
+
+        let input = r#"Text
+{"tool": "write_file", "args": {"content": "He said \"hello\" to me"}}
+More text"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Text\n\nMore text";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_edge_case_malformed_json() {
+        reset_fixed_json_tool_state();
+
+        // Test what happens with malformed JSON that starts like a tool call
+        let input = r#"Text
+{"tool": "shell", "args": {"command": "ls"
+More text"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        // Should handle gracefully - since JSON is incomplete, it should return content before JSON
+        let expected = "Text\n";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_multiple_tool_calls_sequential() {
+        reset_fixed_json_tool_state();
+
+        // Test processing multiple tool calls one at a time
+        let input1 = r#"First text
+{"tool": "shell", "args": {"command": "ls"}}
+Middle text"#;
+        let result1 = fixed_filter_json_tool_calls(input1);
+        let expected1 = "First text\n\nMiddle text";
+        assert_eq!(result1, expected1);
+
+        // Reset and process second tool call
+        reset_fixed_json_tool_state();
+        let input2 = r#"More text
+{"tool": "read_file", "args": {"file_path": "test.txt"}}
+Final text"#;
+        let result2 = fixed_filter_json_tool_calls(input2);
+        let expected2 = "More text\n\nFinal text";
+        assert_eq!(result2, expected2);
+    }
+
+    #[test]
+    fn test_tool_call_with_complex_args() {
+        reset_fixed_json_tool_state();
+
+        let input = r#"Before
+{"tool": "str_replace", "args": {"file_path": "test.rs", "diff": "--- old\n-old line\n+++ new\n+new line", "start": 0, "end": 100}}
+After"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Before\n\nAfter";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_tool_call_only() {
+        reset_fixed_json_tool_state();
+
+        let input = r#"
+{"tool": "final_output", "args": {"summary": "Task completed successfully"}}"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "\n";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_brace_counting_accuracy() {
+        reset_fixed_json_tool_state();
+
+        // Test complex nested structure
+        let input = r#"Start
+{"tool": "write_file", "args": {"content": "function() { return {a: 1, b: {c: 2}}; }", "file_path": "test.js"}}
+End"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Start\n\nEnd";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_string_escaping_in_json() {
+        reset_fixed_json_tool_state();
+
+        // Test JSON with escaped quotes and braces in strings
+        let input = r#"Text
+{"tool": "shell", "args": {"command": "echo \"Hello {world}\" > file.txt"}}
+More"#;
+
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Text\n\nMore";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_specification_compliance() {
+        reset_fixed_json_tool_state();
+
+        // Test the exact specification requirements:
+        // 1. Detect start with regex '\w*{\w*"tool"\w*:\w*"' on newline
+        // 2. Enter suppression mode and use brace counting
+        // 3. Elide only JSON between first '{' and last '}' (inclusive)
+        // 4. Return everything else
+
+        let input = "Before text\nSome more text\n{\"tool\": \"test\", \"args\": {}}\nAfter text\nMore after";
+        let result = fixed_filter_json_tool_calls(input);
+        let expected = "Before text\nSome more text\n\nAfter text\nMore after";
+        assert_eq!(result, expected);
+    }
+
+    #[test]
+    fn test_no_false_positives() {
+        reset_fixed_json_tool_state();
+
+        // Test that we don't incorrectly identify non-tool JSON as tool calls
+        let input = r#"Some text
+{"not_tool": "value", "other": "data"}
+More text"#;
+        let result = fixed_filter_json_tool_calls(input);
+        // Should pass through unchanged since it doesn't match the tool pattern
+        assert_eq!(result, input);
+    }
+
+    #[test]
+    fn test_partial_tool_patterns() {
+        reset_fixed_json_tool_state();
+
+        // Test patterns that look like tool calls but aren't complete
+        let test_cases = vec![
+            "Text\n{\"too\": \"value\"}",   // "too" not "tool"
+            "Text\n{\"tools\": \"value\"}", // "tools" not "tool"
+            "Text\n{\"tool\": }",           // Missing value after colon
+        ];
+
+        for input in test_cases {
+            reset_fixed_json_tool_state();
+            let result = fixed_filter_json_tool_calls(input);
+            // These should all pass through unchanged
+            assert_eq!(result, input, "Input should pass through: {}", input);
+        }
+    }
+
+    #[test]
+    fn test_streaming_edge_cases() {
+        reset_fixed_json_tool_state();
+
+        // Test streaming with very small chunks
+        let chunks = vec![
+            "Text\n", "{", "\"", "tool", "\"", ":", " ", "\"", "test", "\"", "}", "\nAfter",
+        ];
+
+        let mut results = Vec::new();
+        for chunk in chunks {
+            let result = fixed_filter_json_tool_calls(chunk);
+            results.push(result);
+        }
+
+        let final_result: String = results.join("");
+        // This test currently fails because the JSON is incomplete across chunks
+        // The function doesn't handle this edge case properly yet
+        let expected = "Text\n{\"tool\": \nAfter";
+        assert_eq!(final_result, expected);
+    }
+
+    #[test]
+    fn test_streaming_debug() {
+        reset_fixed_json_tool_state();
+
+        // Debug the exact failing case
+        let chunks = vec![
+            "Some text before\n",
+            "{\"tool\": \"",
+            "shell\", \"args\": {",
+            "\"command\": \"ls\"",
+            "}}\nText after",
+        ];
+
+        let mut results = Vec::new();
+        for (i, chunk) in chunks.iter().enumerate() {
+            let result = fixed_filter_json_tool_calls(chunk);
+            println!("Chunk {}: {:?} -> {:?}", i, chunk, result);
+            results.push(result);
+        }
+
+        let final_result: String = results.join("");
+        println!("Final result: {:?}", final_result);
+        println!("Expected: {:?}", "Some text before\n\nText after");
+
+        let expected = "Some text before\n\nText after";
+        assert_eq!(final_result, expected);
+    }
+}
--- a/crates/g3-core/src/lib.rs
+++ b/crates/g3-core/src/lib.rs
--- a/crates/g3-core/src/project.rs
+++ b/crates/g3-core/src/project.rs
@@ -0,0 +1,183 @@
+use anyhow::Result;
+use serde::{Deserialize, Serialize};
+use std::path::{Path, PathBuf};
+
+/// Represents a G3 project with workspace configuration
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct Project {
+    /// The workspace directory for the project
+    pub workspace_dir: PathBuf,
+    
+    /// Path to the requirements document (for autonomous mode)
+    pub requirements_path: Option<PathBuf>,
+    
+    /// Override requirements text (takes precedence over requirements_path)
+    pub requirements_text: Option<String>,
+    
+    /// Whether the project is in autonomous mode
+    pub autonomous: bool,
+    
+    /// Project name (derived from workspace directory name)
+    pub name: String,
+    
+    /// Session ID for tracking
+    pub session_id: Option<String>,
+}
+
+impl Project {
+    /// Create a new project with the given workspace directory
+    pub fn new(workspace_dir: PathBuf) -> Self {
+        let name = workspace_dir
+            .file_name()
+            .and_then(|n| n.to_str())
+            .unwrap_or("unnamed")
+            .to_string();
+        
+        Self {
+            workspace_dir,
+            requirements_path: None,
+            requirements_text: None,
+            autonomous: false,
+            name,
+            session_id: None,
+        }
+    }
+    
+    /// Create a project for autonomous mode
+    pub fn new_autonomous(workspace_dir: PathBuf) -> Result<Self> {
+        let mut project = Self::new(workspace_dir.clone());
+        project.autonomous = true;
+        
+        // Look for requirements.md in the workspace directory
+        let requirements_path = workspace_dir.join("requirements.md");
+        if requirements_path.exists() {
+            project.requirements_path = Some(requirements_path);
+        }
+        
+        Ok(project)
+    }
+    
+    /// Create a project for autonomous mode with requirements text override
+    pub fn new_autonomous_with_requirements(workspace_dir: PathBuf, requirements_text: String) -> Result<Self> {
+        let mut project = Self::new(workspace_dir.clone());
+        project.autonomous = true;
+        project.requirements_text = Some(requirements_text);
+        
+        // Don't look for requirements.md file when text is provided
+        // The text override takes precedence
+        
+        Ok(project)
+    }
+    
+    /// Set the workspace directory and update related paths
+    pub fn set_workspace(&mut self, workspace_dir: PathBuf) {
+        self.workspace_dir = workspace_dir.clone();
+        self.name = workspace_dir
+            .file_name()
+            .and_then(|n| n.to_str())
+            .unwrap_or("unnamed")
+            .to_string();
+        
+        // Update requirements path if in autonomous mode
+        if self.autonomous {
+            let requirements_path = workspace_dir.join("requirements.md");
+            if requirements_path.exists() {
+                self.requirements_path = Some(requirements_path);
+            }
+        }
+    }
+    
+    /// Get the workspace directory
+    pub fn workspace(&self) -> &Path {
+        &self.workspace_dir
+    }
+    
+    /// Check if requirements file exists
+    pub fn has_requirements(&self) -> bool {
+        // Has requirements if either text override is provided or requirements file exists
+        self.requirements_text.is_some() || self.requirements_path.is_some()
+    }
+    
+    /// Check if implementation files exist in the workspace
+    pub fn has_implementation_files(&self) -> bool {
+        self.check_dir_for_implementation_files(&self.workspace_dir)
+    }
+    
+    /// Recursively check a directory for implementation files
+    fn check_dir_for_implementation_files(&self, dir: &Path) -> bool {
+        // Common source file extensions
+        let extensions = vec![
+            "swift", "rs", "py", "js", "ts", "java", "cpp", "c",
+            "go", "rb", "php", "cs", "kt", "scala", "m", "h"
+        ];
+        
+        if let Ok(entries) = std::fs::read_dir(dir) {
+            for entry in entries.flatten() {
+                let path = entry.path();
+                
+                if path.is_file() {
+                    // Check if it's a source file
+                    if let Some(ext) = path.extension() {
+                        if let Some(ext_str) = ext.to_str() {
+                            if extensions.contains(&ext_str) {
+                                return true;
+                            }
+                        }
+                    }
+                } else if path.is_dir() {
+                    // Skip hidden directories and common non-source directories
+                    if let Some(name) = path.file_name().and_then(|n| n.to_str()) {
+                        if !name.starts_with('.') && name != "logs" && name != "target" && name != "node_modules" {
+                            // Recursively check subdirectories
+                            if self.check_dir_for_implementation_files(&path) {
+                                return true;
+                            }
+                        }
+                    }
+                }
+            }
+        }
+        false
+    }
+    
+    /// Read the requirements file content
+    pub fn read_requirements(&self) -> Result<Option<String>> {
+        // Prioritize requirements text override
+        if let Some(ref text) = self.requirements_text {
+            Ok(Some(text.clone()))
+        } else if let Some(ref path) = self.requirements_path {
+            // Fall back to reading from file
+            Ok(Some(std::fs::read_to_string(path)?))
+        } else {
+            Ok(None)
+        }
+    }
+    
+    /// Create the workspace directory if it doesn't exist
+    pub fn ensure_workspace_exists(&self) -> Result<()> {
+        if !self.workspace_dir.exists() {
+            std::fs::create_dir_all(&self.workspace_dir)?;
+        }
+        Ok(())
+    }
+    
+    /// Change to the workspace directory
+    pub fn enter_workspace(&self) -> Result<()> {
+        std::env::set_current_dir(&self.workspace_dir)?;
+        Ok(())
+    }
+    
+    /// Get the logs directory for the project
+    pub fn logs_dir(&self) -> PathBuf {
+        self.workspace_dir.join("logs")
+    }
+    
+    /// Ensure the logs directory exists
+    pub fn ensure_logs_dir(&self) -> Result<()> {
+        let logs_dir = self.logs_dir();
+        if !logs_dir.exists() {
+            std::fs::create_dir_all(&logs_dir)?;
+        }
+        Ok(())
+    }
+}
--- a/crates/g3-core/src/providers/mod.rs
+++ b/crates/g3-core/src/providers/mod.rs
@@ -1 +0,0 @@
-
--- a/crates/g3-core/src/task_result.rs
+++ b/crates/g3-core/src/task_result.rs
@@ -0,0 +1,168 @@
+use crate::ContextWindow;
+
+/// Result of a task execution containing both the response and the context window
+#[derive(Debug, Clone)]
+pub struct TaskResult {
+    /// The actual response content from the task execution
+    pub response: String,
+    /// The complete context window at the time of completion
+    pub context_window: ContextWindow,
+}
+
+impl TaskResult {
+    pub fn new(response: String, context_window: ContextWindow) -> Self {
+        Self {
+            response,
+            context_window,
+        }
+    }
+
+    /// Extract the final_output content from the response (for coach feedback in autonomous mode)
+    /// This looks for the complete final_output content, not just the last block
+    pub fn extract_final_output(&self) -> String {
+        // Remove any timing information at the end
+        let content_without_timing = if let Some(timing_pos) = self.response.rfind("\n⏱️") {
+            &self.response[..timing_pos]
+        } else {
+            &self.response
+        };
+
+        // Look for the final_output marker pattern
+        // The final_output content typically appears after the tool is called
+        // and is the substantive content that follows
+        
+        // First, try to find if there's a clear final_output section
+        // This would be the content after the last tool execution
+        if let Some(final_output_pos) = content_without_timing.rfind("final_output") {
+            // Find the content that follows the final_output call
+            // Skip past the tool call line and any immediate formatting
+            if let Some(content_start) = content_without_timing[final_output_pos..].find('\n') {
+                let start_pos = final_output_pos + content_start + 1;
+                let final_content = &content_without_timing[start_pos..];
+                
+                // Trim and return the complete content
+                let trimmed = final_content.trim();
+                if !trimmed.is_empty() {
+                    return trimmed.to_string();
+                }
+            }
+        }
+        
+        // Fallback to the original extract_last_block behavior if we can't find final_output
+        // This maintains backward compatibility
+        self.extract_last_block()
+    }
+
+    /// Extract the last block from the response (for coach feedback in autonomous mode)
+    /// This looks for the final_output content which is the last substantial block
+    pub fn extract_last_block(&self) -> String {
+        // Remove any timing information at the end
+        let content_without_timing = if let Some(timing_pos) = self.response.rfind("\n⏱️") {
+            &self.response[..timing_pos]
+        } else {
+            &self.response
+        };
+        
+        // Split by double newlines to find the last substantial block
+        let blocks: Vec<&str> = content_without_timing.split("\n\n").collect();
+        
+        // Find the last non-empty block that isn't just whitespace
+        blocks.iter()
+            .rev()
+            .find(|block| !block.trim().is_empty())
+            .map(|block| block.trim().to_string())
+            .unwrap_or_else(|| {
+                // Fallback: if we can't find a clear block, take the whole thing
+                content_without_timing.trim().to_string()
+            })
+    }
+
+    /// Check if the response contains an approval (for autonomous mode)
+    pub fn is_approved(&self) -> bool {
+        self.extract_final_output().contains("IMPLEMENTATION_APPROVED")
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_extract_last_block() {
+        // Test case 1: Response with timing info
+        let context_window = ContextWindow::new(1000);
+        let response_with_timing = "Some initial content\n\nFinal block content\n\n⏱️ 2.3s | 💭 1.2s".to_string();
+        let result = TaskResult::new(response_with_timing, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Final block content");
+        
+        // Test case 2: Response without timing
+        let response_no_timing = "Some initial content\n\nFinal block content".to_string();
+        let result = TaskResult::new(response_no_timing, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Final block content");
+        
+        // Test case 3: Response with IMPLEMENTATION_APPROVED
+        let response_approved = "Some content\n\nIMPLEMENTATION_APPROVED".to_string();
+        let result = TaskResult::new(response_approved, context_window.clone());
+        assert!(result.is_approved());
+        
+        // Test case 4: Response without approval
+        let response_not_approved = "Some content\n\nNeeds more work".to_string();
+        let result = TaskResult::new(response_not_approved, context_window);
+        assert!(!result.is_approved());
+    }
+
+    #[test]
+    fn test_extract_last_block_edge_cases() {
+        let context_window = ContextWindow::new(1000);
+        
+        // Test empty response
+        let empty_response = "".to_string();
+        let result = TaskResult::new(empty_response, context_window.clone());
+        assert_eq!(result.extract_last_block(), "");
+        
+        // Test single block
+        let single_block = "Just one block".to_string();
+        let result = TaskResult::new(single_block, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Just one block");
+        
+        // Test multiple empty blocks
+        let multiple_empty = "\n\n\n\nSome content\n\n\n\n".to_string();
+        let result = TaskResult::new(multiple_empty, context_window);
+        assert_eq!(result.extract_last_block(), "Some content");
+    }
+
+    #[test]
+    fn test_extract_final_output() {
+        let context_window = ContextWindow::new(1000);
+        
+        // Test case 1: Response with final_output tool call
+        let response_with_final_output = "Analyzing files...\n\nCalling final_output\n\nThis is the complete feedback\nwith multiple lines\nand important details\n\n⏱️ 2.3s".to_string();
+        let result = TaskResult::new(response_with_final_output, context_window.clone());
+        assert_eq!(result.extract_final_output(), "This is the complete feedback\nwith multiple lines\nand important details");
+        
+        // Test case 2: Response with IMPLEMENTATION_APPROVED in final_output
+        let response_approved = "Review complete\n\nfinal_output called\n\nIMPLEMENTATION_APPROVED".to_string();
+        let result = TaskResult::new(response_approved, context_window.clone());
+        assert_eq!(result.extract_final_output(), "IMPLEMENTATION_APPROVED");
+        assert!(result.is_approved());
+        
+        // Test case 3: Response with detailed feedback in final_output
+        let response_feedback = "Checking implementation...\n\nfinal_output\n\nThe following issues need to be addressed:\n1. Missing error handling in main.rs\n2. Tests are not comprehensive\n3. Documentation needs improvement\n\nPlease fix these issues.".to_string();
+        let result = TaskResult::new(response_feedback, context_window.clone());
+        let extracted = result.extract_final_output();
+        assert!(extracted.contains("The following issues need to be addressed:"));
+        assert!(extracted.contains("1. Missing error handling"));
+        assert!(extracted.contains("Please fix these issues."));
+        assert!(!result.is_approved());
+        
+        // Test case 4: Response without final_output (fallback to extract_last_block)
+        let response_no_final_output = "Some analysis\n\nFinal thoughts here".to_string();
+        let result = TaskResult::new(response_no_final_output, context_window.clone());
+        assert_eq!(result.extract_final_output(), "Final thoughts here");
+        
+        // Test case 5: Empty response
+        let empty_response = "".to_string();
+        let result = TaskResult::new(empty_response, context_window);
+        assert_eq!(result.extract_final_output(), "");
+    }
+}
--- a/crates/g3-core/src/task_result_comprehensive_tests.rs
+++ b/crates/g3-core/src/task_result_comprehensive_tests.rs
@@ -0,0 +1,247 @@
+use crate::{ContextWindow, TaskResult};
+use g3_providers::{Message, MessageRole};
+use std::sync::Arc;
+
+#[test]
+fn test_task_result_basic_functionality() {
+    // Create a context window with some messages
+    let mut context = ContextWindow::new(10000);
+    context.add_message(Message {
+        role: MessageRole::User,
+        content: "Test message 1".to_string(),
+    });
+    context.add_message(Message {
+        role: MessageRole::Assistant,
+        content: "Response 1".to_string(),
+    });
+    
+    // Create a TaskResult
+    let response = "This is the response\n\nFinal output block".to_string();
+    let result = TaskResult::new(response.clone(), context.clone());
+    
+    // Test basic properties
+    assert_eq!(result.response, response);
+    assert_eq!(result.context_window.conversation_history.len(), 2);
+    assert_eq!(result.context_window.total_tokens, 10000);
+}
+
+#[test]
+fn test_extract_last_block_various_formats() {
+    let context = ContextWindow::new(1000);
+    
+    // Test 1: Standard format with multiple blocks
+    let response1 = "First block\n\nSecond block\n\nThird block".to_string();
+    let result1 = TaskResult::new(response1, context.clone());
+    assert_eq!(result1.extract_last_block(), "Third block");
+    
+    // Test 2: With timing information
+    let response2 = "Content\n\nFinal block\n\n⏱️ 2.3s | 💭 1.2s".to_string();
+    let result2 = TaskResult::new(response2, context.clone());
+    assert_eq!(result2.extract_last_block(), "Final block");
+    
+    // Test 3: Single line response
+    let response3 = "Single line response".to_string();
+    let result3 = TaskResult::new(response3, context.clone());
+    assert_eq!(result3.extract_last_block(), "Single line response");
+    
+    // Test 4: Empty response
+    let response4 = "".to_string();
+    let result4 = TaskResult::new(response4, context.clone());
+    assert_eq!(result4.extract_last_block(), "");
+    
+    // Test 5: Only whitespace
+    let response5 = "\n\n\n   \n\n".to_string();
+    let result5 = TaskResult::new(response5, context.clone());
+    assert_eq!(result5.extract_last_block(), "");
+    
+    // Test 6: Multiple blocks with empty ones
+    let response6 = "First\n\n\n\n\n\nLast block here".to_string();
+    let result6 = TaskResult::new(response6, context.clone());
+    assert_eq!(result6.extract_last_block(), "Last block here");
+}
+
+#[test]
+fn test_is_approved_detection() {
+    let context = ContextWindow::new(1000);
+    
+    // Test approved cases
+    let approved_responses = vec![
+        "Analysis complete\n\nIMPLEMENTATION_APPROVED",
+        "Some content\n\nThe implementation is good. IMPLEMENTATION_APPROVED",
+        "IMPLEMENTATION_APPROVED",
+        "Review done\n\n✅ IMPLEMENTATION_APPROVED - All tests pass",
+    ];
+    
+    for response in approved_responses {
+        let result = TaskResult::new(response.to_string(), context.clone());
+        assert!(result.is_approved(), "Failed to detect approval in: {}", response);
+    }
+    
+    // Test not approved cases
+    let not_approved_responses = vec![
+        "Needs more work",
+        "Implementation needs fixes",
+        "IMPLEMENTATION_REJECTED",
+        "Almost there but not APPROVED",
+        "",
+    ];
+    
+    for response in not_approved_responses {
+        let result = TaskResult::new(response.to_string(), context.clone());
+        assert!(!result.is_approved(), "Incorrectly detected approval in: {}", response);
+    }
+}
+
+#[test]
+fn test_context_window_preservation() {
+    // Create a context window with specific state
+    let mut context = ContextWindow::new(5000);
+    context.used_tokens = 1234;
+    
+    // Add some messages
+    for i in 0..5 {
+        context.add_message(Message {
+            role: if i % 2 == 0 { MessageRole::User } else { MessageRole::Assistant },
+            content: format!("Message {}", i),
+        });
+    }
+    
+    // Create TaskResult
+    let result = TaskResult::new("Response".to_string(), context.clone());
+    
+    // Verify context is preserved
+    assert_eq!(result.context_window.total_tokens, 5000);
+    assert!(result.context_window.used_tokens > 1234); // Should have increased
+    assert_eq!(result.context_window.conversation_history.len(), 5);
+    
+    // Verify messages are preserved correctly
+    for i in 0..5 {
+        let is_user = matches!(result.context_window.conversation_history[i].role, MessageRole::User);
+        let expected_is_user = i % 2 == 0;
+        assert_eq!(is_user, expected_is_user, "Message {} has wrong role", i);
+        assert_eq!(result.context_window.conversation_history[i].content, format!("Message {}", i));
+    }
+}
+
+#[test]
+fn test_coach_feedback_extraction_scenarios() {
+    let context = ContextWindow::new(1000);
+    
+    // Scenario 1: Coach feedback with file operations and analysis
+    let coach_response = r#"Reading file: src/main.rs
+📄 File content (23 lines):
+fn main() {
+    println!("Hello");
+}
+
+Analyzing implementation...
+
+The implementation needs the following fixes:
+1. Add error handling
+2. Implement missing functions
+3. Add tests"#;
+    
+    let result = TaskResult::new(coach_response.to_string(), context.clone());
+    let feedback = result.extract_last_block();
+    assert!(feedback.contains("Add error handling"));
+    assert!(feedback.contains("Implement missing functions"));
+    assert!(feedback.contains("Add tests"));
+    
+    // Scenario 2: Coach approval
+    let approval_response = r#"Checking compilation...
+✅ Build successful
+
+Running tests...
+✅ All tests pass
+
+IMPLEMENTATION_APPROVED"#;
+    
+    let result = TaskResult::new(approval_response.to_string(), context.clone());
+    assert!(result.is_approved());
+    assert_eq!(result.extract_last_block(), "IMPLEMENTATION_APPROVED");
+    
+    // Scenario 3: Complex feedback with timing
+    let complex_response = r#"Tool execution log...
+
+Analysis complete.
+
+The following issues were found:
+- Memory leak in process_data()
+- Missing input validation
+
+⏱️ 5.2s | 💭 2.1s"#;
+    
+    let result = TaskResult::new(complex_response.to_string(), context.clone());
+    let feedback = result.extract_last_block();
+    assert!(feedback.contains("Memory leak"));
+    assert!(feedback.contains("Missing input validation"));
+    assert!(!feedback.contains("⏱️")); // Timing should be stripped
+}
+
+#[test]
+fn test_edge_cases_and_special_characters() {
+    let context = ContextWindow::new(1000);
+    
+    // Test with special characters and emojis
+    let response_with_emojis = "First part 🚀\n\n✅ Final part with emojis 🎉".to_string();
+    let result = TaskResult::new(response_with_emojis, context.clone());
+    assert_eq!(result.extract_last_block(), "✅ Final part with emojis 🎉");
+    
+    // Test with code blocks
+    let response_with_code = "Explanation\n\n```rust\nfn main() {}\n```\n\nFinal comment".to_string();
+    let result = TaskResult::new(response_with_code, context.clone());
+    assert_eq!(result.extract_last_block(), "Final comment");
+    
+    // Test with mixed newlines
+    let mixed_newlines = "Part 1\r\n\r\nPart 2\n\nPart 3".to_string();
+    let result = TaskResult::new(mixed_newlines, context.clone());
+    assert_eq!(result.extract_last_block(), "Part 3");
+}
+
+#[test]
+fn test_large_response_handling() {
+    let context = ContextWindow::new(100000);
+    
+    // Create a large response
+    let mut large_response = String::new();
+    for i in 0..100 {
+        large_response.push_str(&format!("Block {} with some content\n\n", i));
+    }
+    large_response.push_str("This is the final block after 100 other blocks");
+    
+    let result = TaskResult::new(large_response, context);
+    assert_eq!(result.extract_last_block(), "This is the final block after 100 other blocks");
+}
+
+#[test]
+fn test_concurrent_access() {
+    use std::thread;
+    
+    let context = ContextWindow::new(1000);
+    let result = Arc::new(TaskResult::new(
+        "Concurrent test\n\nFinal block".to_string(),
+        context,
+    ));
+    
+    let mut handles = vec![];
+    
+    // Spawn multiple threads to access the TaskResult
+    for _ in 0..10 {
+        let result_clone = Arc::clone(&result);
+        let handle = thread::spawn(move || {
+            // Each thread extracts the last block
+            let block = result_clone.extract_last_block();
+            assert_eq!(block, "Final block");
+            
+            // Check approval status
+            assert!(!result_clone.is_approved());
+        });
+        handles.push(handle);
+    }
+    
+    // Wait for all threads to complete
+    for handle in handles {
+        handle.join().unwrap();
+    }
+}
+
--- a/crates/g3-core/src/task_result_tests.rs
+++ b/crates/g3-core/src/task_result_tests.rs
@@ -0,0 +1,48 @@
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_extract_last_block() {
+        // Test case 1: Response with timing info
+        let context_window = ContextWindow::new(1000);
+        let response_with_timing = "Some initial content\n\nFinal block content\n\n⏱️ 2.3s | 💭 1.2s".to_string();
+        let result = TaskResult::new(response_with_timing, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Final block content");
+        
+        // Test case 2: Response without timing
+        let response_no_timing = "Some initial content\n\nFinal block content".to_string();
+        let result = TaskResult::new(response_no_timing, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Final block content");
+        
+        // Test case 3: Response with IMPLEMENTATION_APPROVED
+        let response_approved = "Some content\n\nIMPLEMENTATION_APPROVED".to_string();
+        let result = TaskResult::new(response_approved, context_window.clone());
+        assert!(result.is_approved());
+        
+        // Test case 4: Response without approval
+        let response_not_approved = "Some content\n\nNeeds more work".to_string();
+        let result = TaskResult::new(response_not_approved, context_window);
+        assert!(!result.is_approved());
+    }
+
+    #[test]
+    fn test_extract_last_block_edge_cases() {
+        let context_window = ContextWindow::new(1000);
+        
+        // Test empty response
+        let empty_response = "".to_string();
+        let result = TaskResult::new(empty_response, context_window.clone());
+        assert_eq!(result.extract_last_block(), "");
+        
+        // Test single block
+        let single_block = "Just one block".to_string();
+        let result = TaskResult::new(single_block, context_window.clone());
+        assert_eq!(result.extract_last_block(), "Just one block");
+        
+        // Test multiple empty blocks
+        let multiple_empty = "\n\n\n\nSome content\n\n\n\n".to_string();
+        let result = TaskResult::new(multiple_empty, context_window);
+        assert_eq!(result.extract_last_block(), "Some content");
+    }
+}
--- a/crates/g3-core/src/tilde_expansion_tests.rs
+++ b/crates/g3-core/src/tilde_expansion_tests.rs
@@ -0,0 +1,36 @@
+#[cfg(test)]
+mod tilde_expansion_tests {
+    use std::env;
+
+    #[test]
+    fn test_tilde_expansion() {
+        // Test that shellexpand works
+        let path_with_tilde = "~/test.txt";
+        let expanded = shellexpand::tilde(path_with_tilde);
+        
+        // Get the actual home directory
+        let home = env::var("HOME").expect("HOME environment variable not set");
+        
+        // Verify expansion happened
+        assert_eq!(expanded.as_ref(), format!("{}/test.txt", home));
+        assert!(!expanded.contains("~"));
+    }
+
+    #[test]
+    fn test_tilde_expansion_with_subdirs() {
+        let path_with_tilde = "~/Documents/test.txt";
+        let expanded = shellexpand::tilde(path_with_tilde);
+        
+        let home = env::var("HOME").expect("HOME environment variable not set");
+        
+        assert_eq!(expanded.as_ref(), format!("{}/Documents/test.txt", home));
+    }
+
+    #[test]
+    fn test_no_tilde_unchanged() {
+        let path_without_tilde = "/absolute/path/test.txt";
+        let expanded = shellexpand::tilde(path_without_tilde);
+        
+        assert_eq!(expanded.as_ref(), path_without_tilde);
+    }
+}
--- a/crates/g3-core/src/ui_writer.rs
+++ b/crates/g3-core/src/ui_writer.rs
@@ -0,0 +1,74 @@
+/// Interface for UI output operations
+/// This trait abstracts all UI operations to allow different implementations
+/// (console, TUI, web, etc.) without coupling the core logic to specific output methods.
+pub trait UiWriter: Send + Sync {
+    /// Print a simple message
+    fn print(&self, message: &str);
+    
+    /// Print a message with a newline
+    fn println(&self, message: &str);
+    
+    /// Print without newline (for progress indicators)
+    fn print_inline(&self, message: &str);
+    
+    /// Print a system prompt section
+    fn print_system_prompt(&self, prompt: &str);
+    
+    /// Print a context window status message
+    fn print_context_status(&self, message: &str);
+    
+    /// Print a tool execution header
+    fn print_tool_header(&self, tool_name: &str);
+    
+    /// Print a tool argument
+    fn print_tool_arg(&self, key: &str, value: &str);
+    
+    /// Print tool output header
+    fn print_tool_output_header(&self);
+    
+    /// Update the current tool output line (replaces previous line)
+    fn update_tool_output_line(&self, line: &str);
+    
+    /// Print a tool output line
+    fn print_tool_output_line(&self, line: &str);
+    
+    /// Print tool output summary (when output is truncated)
+    fn print_tool_output_summary(&self, hidden_count: usize);
+    
+    /// Print tool execution timing
+    fn print_tool_timing(&self, duration_str: &str);
+    
+    /// Print the agent prompt indicator
+    fn print_agent_prompt(&self);
+    
+    /// Print agent response inline (for streaming)
+    fn print_agent_response(&self, content: &str);
+    
+    /// Notify that an SSE event was received (including pings)
+    fn notify_sse_received(&self);
+    
+    /// Flush any buffered output
+    fn flush(&self);
+}
+
+/// A no-op implementation for when UI output is not needed
+pub struct NullUiWriter;
+
+impl UiWriter for NullUiWriter {
+    fn print(&self, _message: &str) {}
+    fn println(&self, _message: &str) {}
+    fn print_inline(&self, _message: &str) {}
+    fn print_system_prompt(&self, _prompt: &str) {}
+    fn print_context_status(&self, _message: &str) {}
+    fn print_tool_header(&self, _tool_name: &str) {}
+    fn print_tool_arg(&self, _key: &str, _value: &str) {}
+    fn print_tool_output_header(&self) {}
+    fn update_tool_output_line(&self, _line: &str) {}
+    fn print_tool_output_line(&self, _line: &str) {}
+    fn print_tool_output_summary(&self, _hidden_count: usize) {}
+    fn print_tool_timing(&self, _duration_str: &str) {}
+    fn print_agent_prompt(&self) {}
+    fn print_agent_response(&self, _content: &str) {}
+    fn notify_sse_received(&self) {}
+    fn flush(&self) {}
+}
--- a/crates/g3-core/tests/test_context_thinning.rs
+++ b/crates/g3-core/tests/test_context_thinning.rs
@@ -0,0 +1,157 @@
+use g3_core::ContextWindow;
+use g3_providers::{Message, MessageRole};
+
+#[test]
+fn test_thinning_thresholds() {
+    let mut context = ContextWindow::new(10000);
+    
+    // At 0%, should not thin
+    assert!(!context.should_thin());
+    
+    // Simulate reaching 50% usage
+    context.used_tokens = 5000;
+    assert!(context.should_thin());
+    
+    // After thinning at 50%, should not thin again until next threshold
+    context.last_thinning_percentage = 50;
+    assert!(!context.should_thin());
+    
+    // At 60%, should thin again
+    context.used_tokens = 6000;
+    assert!(context.should_thin());
+    
+    // After thinning at 60%, should not thin
+    context.last_thinning_percentage = 60;
+    assert!(!context.should_thin());
+    
+    // At 70%, should thin
+    context.used_tokens = 7000;
+    assert!(context.should_thin());
+    
+    // At 80%, should thin
+    context.last_thinning_percentage = 70;
+    context.used_tokens = 8000;
+    assert!(context.should_thin());
+    
+    // After 80%, should not thin (compaction takes over)
+    context.last_thinning_percentage = 80;
+    context.used_tokens = 8500;
+    assert!(!context.should_thin());
+}
+
+#[test]
+fn test_thin_context_basic() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add some messages to the first third
+    for i in 0..9 {
+        if i % 2 == 0 {
+            context.add_message(Message {
+                role: MessageRole::Assistant,
+                content: format!("Assistant message {}", i),
+            });
+        } else {
+            // Add tool results with varying sizes
+            let content = if i == 1 {
+                // Large tool result (> 1000 chars)
+                format!("Tool result: {}", "x".repeat(1500))
+            } else if i == 3 {
+                // Another large tool result
+                format!("Tool result: {}", "y".repeat(2000))
+            } else {
+                // Small tool result (< 1000 chars)
+                format!("Tool result: small result {}", i)
+            };
+            
+            context.add_message(Message {
+                role: MessageRole::User,
+                content,
+            });
+        }
+    }
+    
+    // Trigger thinning at 50%
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    println!("Thinning summary: {}", summary);
+    
+    // Should have thinned at least 1 large tool result in the first third
+    assert!(summary.contains("1 tool result"), "Summary was: {}", summary);
+    assert!(summary.contains("50%"));
+    
+    // Check that the large tool results were replaced
+    let first_third_end = context.conversation_history.len() / 3;
+    for i in 0..first_third_end {
+        if let Some(msg) = context.conversation_history.get(i) {
+            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
+                if msg.content.len() > 1000 {
+                    panic!("Found un-thinned large tool result at index {}", i);
+                }
+            }
+        }
+    }
+}
+
+#[test]
+fn test_thin_context_no_large_results() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add only small messages
+    for i in 0..9 {
+        context.add_message(Message {
+            role: MessageRole::User,
+            content: format!("Tool result: small {}", i),
+        });
+    }
+    
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    // Should report no large results found
+    assert!(summary.contains("no large tool results found"));
+}
+
+#[test]
+fn test_thin_context_only_affects_first_third() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add 12 messages (first third = 4 messages)
+    for i in 0..12 {
+        let content = if i % 2 == 1 {
+            // All odd indices are large tool results
+            format!("Tool result: {}", "x".repeat(1500))
+        } else {
+            format!("Assistant message {}", i)
+        };
+        
+        let role = if i % 2 == 1 {
+            MessageRole::User
+        } else {
+            MessageRole::Assistant
+        };
+        
+        context.add_message(Message { role, content });
+    }
+    
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    // First third is 4 messages (indices 0-3), so only indices 1 and 3 should be thinned
+    // That's 2 tool results
+    assert!(summary.contains("2 tool results"));
+    
+    // Check that messages after the first third are NOT thinned
+    let first_third_end = context.conversation_history.len() / 3;
+    for i in first_third_end..context.conversation_history.len() {
+        if let Some(msg) = context.conversation_history.get(i) {
+            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
+                // These should still be large (not thinned)
+                if i % 2 == 1 {
+                    assert!(msg.content.len() > 1000, 
+                        "Message at index {} should not have been thinned", i);
+                }
+            }
+        }
+    }
+}
--- a/crates/g3-core/tests/test_token_counting.rs
+++ b/crates/g3-core/tests/test_token_counting.rs
@@ -0,0 +1,94 @@
+use g3_core::ContextWindow;
+use g3_providers::Usage;
+
+#[test]
+fn test_token_accumulation() {
+    let mut window = ContextWindow::new(10000);
+    
+    // First API call: 100 prompt + 50 completion = 150 total
+    let usage1 = Usage {
+        prompt_tokens: 100,
+        completion_tokens: 50,
+        total_tokens: 150,
+    };
+    window.update_usage_from_response(&usage1);
+    assert_eq!(window.used_tokens, 150, "First call should have 150 tokens");
+    assert_eq!(window.cumulative_tokens, 150, "Cumulative should be 150");
+
+    // Second API call: 200 prompt + 75 completion = 275 total
+    let usage2 = Usage {
+        prompt_tokens: 200,
+        completion_tokens: 75,
+        total_tokens: 275,
+    };
+    window.update_usage_from_response(&usage2);
+    assert_eq!(window.used_tokens, 425, "Second call should accumulate to 425 tokens");
+    assert_eq!(window.cumulative_tokens, 425, "Cumulative should be 425");
+
+    // Third API call with SMALLER token count: 50 prompt + 25 completion = 75 total
+    let usage3 = Usage {
+        prompt_tokens: 50,
+        completion_tokens: 25,
+        total_tokens: 75,
+    };
+    window.update_usage_from_response(&usage3);
+    assert_eq!(window.used_tokens, 500, "Third call should accumulate to 500 tokens");
+    assert_eq!(window.cumulative_tokens, 500, "Cumulative should be 500");
+    
+    // Verify tokens never decrease
+    assert!(window.used_tokens >= 425, "Token count should never decrease!");
+}
+
+#[test]
+fn test_add_streaming_tokens() {
+    let mut window = ContextWindow::new(10000);
+    
+    // Add some streaming tokens
+    window.add_streaming_tokens(100);
+    assert_eq!(window.used_tokens, 100);
+    assert_eq!(window.cumulative_tokens, 100);
+    
+    // Add more
+    window.add_streaming_tokens(50);
+    assert_eq!(window.used_tokens, 150);
+    assert_eq!(window.cumulative_tokens, 150);
+    
+    // Now update from provider response
+    let usage = Usage {
+        prompt_tokens: 80,
+        completion_tokens: 40,
+        total_tokens: 120,
+    };
+    window.update_usage_from_response(&usage);
+    
+    // Should ADD to existing, not replace
+    assert_eq!(window.used_tokens, 270, "Should add 120 to existing 150");
+    assert_eq!(window.cumulative_tokens, 270);
+}
+
+#[test]
+fn test_percentage_calculation() {
+    let mut window = ContextWindow::new(1000);
+    
+    // Add tokens via provider response
+    let usage = Usage {
+        prompt_tokens: 150,
+        completion_tokens: 100,
+        total_tokens: 250,
+    };
+    window.update_usage_from_response(&usage);
+    
+    assert_eq!(window.percentage_used(), 25.0);
+    assert_eq!(window.remaining_tokens(), 750);
+    
+    // Add more tokens
+    let usage2 = Usage {
+        prompt_tokens: 300,
+        completion_tokens: 200,
+        total_tokens: 500,
+    };
+    window.update_usage_from_response(&usage2);
+    
+    assert_eq!(window.percentage_used(), 75.0);
+    assert_eq!(window.remaining_tokens(), 250);
+}
--- a/crates/g3-execution/Cargo.toml
+++ b/crates/g3-execution/Cargo.toml
@@ -7,6 +7,7 @@ description = "Code execution engine for G3 AI agent"
 [dependencies]
 tokio = { workspace = true }
 anyhow = { workspace = true }
+futures = "0.3"
 thiserror = { workspace = true }
 tracing = { workspace = true }
 regex = "1.0"
--- a/crates/g3-execution/src/lib.rs
+++ b/crates/g3-execution/src/lib.rs
@@ -203,3 +203,82 @@ impl Default for CodeExecutor {
        Self::new()
    }
 }
+
+/// Trait for receiving streaming output from command execution
+pub trait OutputReceiver: Send + Sync {
+    /// Called when a new line of output is available
+    fn on_output_line(&self, line: &str);
+}
+
+impl CodeExecutor {
+    /// Execute bash command with streaming output
+    pub async fn execute_bash_streaming<R: OutputReceiver>(
+        &self, 
+        code: &str, 
+        receiver: &R
+    ) -> Result<ExecutionResult> {
+        use std::process::Stdio;
+        use tokio::io::{AsyncBufReadExt, BufReader};
+        use tokio::process::Command as TokioCommand;
+        
+        let mut child = TokioCommand::new("bash")
+            .arg("-c")
+            .arg(code)
+            .stdout(Stdio::piped())
+            .stderr(Stdio::piped())
+            .spawn()?;
+        
+        let stdout = child.stdout.take().unwrap();
+        let stderr = child.stderr.take().unwrap();
+        
+        let stdout_reader = BufReader::new(stdout);
+        let stderr_reader = BufReader::new(stderr);
+        
+        let mut stdout_lines = stdout_reader.lines();
+        let mut stderr_lines = stderr_reader.lines();
+        
+        let mut stdout_output = Vec::new();
+        let mut stderr_output = Vec::new();
+        
+        // Read output lines as they come
+        loop {
+            tokio::select! {
+                line = stdout_lines.next_line() => {
+                    match line {
+                        Ok(Some(line)) => {
+                            receiver.on_output_line(&line);
+                            stdout_output.push(line);
+                        }
+                        Ok(None) => break, // EOF
+                        Err(e) => {
+                            error!("Error reading stdout: {}", e);
+                            break;
+                        }
+                    }
+                }
+                line = stderr_lines.next_line() => {
+                    match line {
+                        Ok(Some(line)) => {
+                            receiver.on_output_line(&format!("{}", line));
+                            stderr_output.push(line);
+                        }
+                        Ok(None) => {}, // stderr EOF, continue
+                        Err(e) => {
+                            error!("Error reading stderr: {}", e);
+                        }
+                    }
+                }
+                else => break
+            }
+        }
+        
+        let status = child.wait().await?;
+        
+        Ok(ExecutionResult {
+            stdout: stdout_output.join("\n"),
+            stderr: stderr_output.join("\n"),
+            exit_code: status.code().unwrap_or(-1),
+            success: status.success(),
+        })
+    }
+}
--- a/crates/g3-providers/Cargo.toml
+++ b/crates/g3-providers/Cargo.toml
@@ -16,3 +16,16 @@ async-trait = "0.1"
 tokio-stream = "0.1"
 futures-util = "0.3"
 bytes = "1.0"
+# OAuth dependencies
+axum = "0.7"
+base64 = "0.22"
+chrono = { version = "0.4", features = ["serde"] }
+sha2 = "0.10"
+url = "2.5"
+webbrowser = "1.0"
+nanoid = "0.4"
+serde_urlencoded = "0.7"
+tokio-util = "0.7"
+dirs = "5.0"
+llama_cpp = { version = "0.3.2", features = ["metal"] }
+shellexpand = "3.1"
--- a/crates/g3-providers/src/anthropic.rs
+++ b/crates/g3-providers/src/anthropic.rs
@@ -41,6 +41,7 @@
 //!         max_tokens: Some(1000),
 //!         temperature: Some(0.7),
 //!         stream: false,
+//!         tools: None,
 //!     };
 //!
 //!     // Get a completion
@@ -74,6 +75,7 @@
 //!         max_tokens: Some(1000),
 //!         temperature: Some(0.7),
 //!         stream: true,
+//!         tools: None,
 //!     };
 //!
 //!     let mut stream = provider.stream(request).await?;
@@ -194,20 +196,6 @@ impl AnthropicProvider {
            .collect()
    }

-    fn convert_anthropic_tool_calls(&self, content: &[AnthropicContent]) -> Vec<ToolCall> {
-        content
-            .iter()
-            .filter_map(|c| match c {
-                AnthropicContent::ToolUse { id, name, input } => Some(ToolCall {
-                    id: id.clone(),
-                    tool: name.clone(),
-                    args: input.clone(),
-                }),
-                _ => None,
-            })
-            .collect()
-    }
-
    fn convert_messages(&self, messages: &[Message]) -> Result<(Option<String>, Vec<AnthropicMessage>)> {
        let mut system_message = None;
        let mut anthropic_messages = Vec::new();
@@ -281,26 +269,42 @@ impl AnthropicProvider {
        &self,
        mut stream: impl futures_util::Stream<Item = reqwest::Result<Bytes>> + Unpin,
        tx: mpsc::Sender<Result<CompletionChunk>>,
-    ) {
+    ) -> Option<Usage> {
        let mut buffer = String::new();
        let mut current_tool_calls: Vec<ToolCall> = Vec::new();
        let mut partial_tool_json = String::new(); // Accumulate partial JSON for tool calls
+        let mut accumulated_usage: Option<Usage> = None;
+        let mut byte_buffer = Vec::new(); // Buffer for incomplete UTF-8 sequences
        
        while let Some(chunk_result) = stream.next().await {
            match chunk_result {
                Ok(chunk) => {
-                    let chunk_str = match std::str::from_utf8(&chunk) {
-                        Ok(s) => s,
+                    // Append new bytes to our buffer
+                    byte_buffer.extend_from_slice(&chunk);
+                    
+                    // Try to convert the entire buffer to UTF-8
+                    let chunk_str = match std::str::from_utf8(&byte_buffer) {
+                        Ok(s) => {
+                            // Successfully converted entire buffer, clear it and use the string
+                            let result = s.to_string();
+                            byte_buffer.clear();
+                            result
+                        }
                        Err(e) => {
-                            error!("Invalid UTF-8 in stream chunk: {}", e);
-                            let _ = tx
-                                .send(Err(anyhow!("Invalid UTF-8 in stream chunk: {}", e)))
-                                .await;
-                            return;
+                            // Check if this is an incomplete sequence at the end
+                            let valid_up_to = e.valid_up_to();
+                            if valid_up_to > 0 {
+                                // We have some valid UTF-8, extract it and keep the rest for next iteration
+                                let valid_bytes = byte_buffer.drain(..valid_up_to).collect::<Vec<_>>();
+                                std::str::from_utf8(&valid_bytes).unwrap().to_string()
+                            } else {
+                                // No valid UTF-8 at all, skip this chunk and continue
+                                continue;
+                            }
                        }
                    };

-                    buffer.push_str(chunk_str);
+                    buffer.push_str(&chunk_str);

                    // Process complete lines
                    while let Some(line_end) = buffer.find('\n') {
@@ -318,20 +322,34 @@ impl AnthropicProvider {
                                let final_chunk = CompletionChunk {
                                    content: String::new(),
                                    finished: true,
+                                    usage: accumulated_usage.clone(),
                                    tool_calls: if current_tool_calls.is_empty() { None } else { Some(current_tool_calls.clone()) },
                                };
                                if tx.send(Ok(final_chunk)).await.is_err() {
                                    debug!("Receiver dropped, stopping stream");
                                }
-                                return;
+                                return accumulated_usage;
                            }

                            debug!("Raw Claude API JSON: {}", data);

                            match serde_json::from_str::<AnthropicStreamEvent>(data) {
                                Ok(event) => {
-                                    debug!("Parsed event: {:?}", event);
+                                    debug!("Parsed event type: {}, event: {:?}", event.event_type, event);
                                    match event.event_type.as_str() {
+                                        "message_start" => {
+                                            // Extract usage data from message_start event
+                                            if let Some(message) = event.message {
+                                                if let Some(usage) = message.usage {
+                                                    accumulated_usage = Some(Usage {
+                                                        prompt_tokens: usage.input_tokens,
+                                                        completion_tokens: usage.output_tokens,
+                                                        total_tokens: usage.input_tokens + usage.output_tokens,
+                                                    });
+                                                    debug!("Captured usage from message_start: {:?}", accumulated_usage);
+                                                }
+                                            }
+                                        }
                                        "content_block_start" => {
                                            debug!("Received content_block_start event: {:?}", event);
                                            if let Some(content_block) = event.content_block {
@@ -354,11 +372,12 @@ impl AnthropicProvider {
                                                            let chunk = CompletionChunk {
                                                                content: String::new(),
                                                                finished: false,
+                                                                usage: None,
                                                                tool_calls: Some(vec![tool_call]),
                                                            };
                                                            if tx.send(Ok(chunk)).await.is_err() {
                                                                debug!("Receiver dropped, stopping stream");
-                                                                return;
+                                                                return accumulated_usage;
                                                            }
                                                        } else {
                                                            // Arguments are empty, we'll accumulate them from partial_json
@@ -380,11 +399,12 @@ impl AnthropicProvider {
                                                    let chunk = CompletionChunk {
                                                        content: text,
                                                        finished: false,
+                                                        usage: None,
                                                        tool_calls: None,
                                                    };
                                                    if tx.send(Ok(chunk)).await.is_err() {
                                                        debug!("Receiver dropped, stopping stream");
-                                                        return;
+                                                        return accumulated_usage;
                                                    }
                                                }
                                                // Handle partial JSON for tool calls
@@ -419,11 +439,12 @@ impl AnthropicProvider {
                                                let chunk = CompletionChunk {
                                                    content: String::new(),
                                                    finished: false,
+                                                    usage: None,
                                                    tool_calls: Some(current_tool_calls.clone()),
                                                };
                                                if tx.send(Ok(chunk)).await.is_err() {
                                                    debug!("Receiver dropped, stopping stream");
-                                                    return;
+                                                    return accumulated_usage;
                                                }
                                            }
                                        }
@@ -432,12 +453,13 @@ impl AnthropicProvider {
                                            let final_chunk = CompletionChunk {
                                                content: String::new(),
                                                finished: true,
+                                                usage: accumulated_usage.clone(),
                                                tool_calls: if current_tool_calls.is_empty() { None } else { Some(current_tool_calls.clone()) },
                                            };
                                            if tx.send(Ok(final_chunk)).await.is_err() {
                                                debug!("Receiver dropped, stopping stream");
                                            }
-                                            return;
+                                            return accumulated_usage;
                                        }
                                        "error" => {
                                            if let Some(error) = event.error {
@@ -445,7 +467,7 @@ impl AnthropicProvider {
                                                let _ = tx
                                                    .send(Err(anyhow!("Anthropic API error: {:?}", error)))
                                                    .await;
-                                                return;
+                                                return accumulated_usage;
                                            }
                                        }
                                        _ => {
@@ -464,7 +486,7 @@ impl AnthropicProvider {
                Err(e) => {
                    error!("Stream error: {}", e);
                    let _ = tx.send(Err(anyhow!("Stream error: {}", e))).await;
-                    return;
+                    return accumulated_usage;
                }
            }
        }
@@ -473,9 +495,11 @@ impl AnthropicProvider {
        let final_chunk = CompletionChunk {
            content: String::new(),
            finished: true,
+            usage: accumulated_usage.clone(),
            tool_calls: if current_tool_calls.is_empty() { None } else { Some(current_tool_calls) },
        };
        let _ = tx.send(Ok(final_chunk)).await;
+        accumulated_usage
    }
 }

@@ -596,7 +620,14 @@ impl LLMProvider for AnthropicProvider {
        // Spawn task to process the stream
        let provider = self.clone();
        tokio::spawn(async move {
-            provider.parse_streaming_response(stream, tx).await;
+            let usage = provider.parse_streaming_response(stream, tx).await;
+            // Log the final usage if available
+            if let Some(usage) = usage {
+                debug!(
+                    "Stream completed with usage - prompt: {}, completion: {}, total: {}",
+                    usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
+                );
+            }
        });

        Ok(ReceiverStream::new(rx))
@@ -668,14 +699,8 @@ enum AnthropicContent {

 #[derive(Debug, Deserialize)]
 struct AnthropicResponse {
-    id: String,
-    #[serde(rename = "type")]
-    response_type: String,
-    role: String,
    content: Vec<AnthropicContent>,
    model: String,
-    stop_reason: Option<String>,
-    stop_sequence: Option<String>,
    usage: AnthropicUsage,
 }

@@ -697,12 +722,18 @@ struct AnthropicStreamEvent {
    error: Option<AnthropicError>,
    #[serde(default)]
    content_block: Option<AnthropicContent>,
+    #[serde(default)]
+    message: Option<AnthropicStreamMessage>,
+}
+
+#[derive(Debug, Deserialize)]
+struct AnthropicStreamMessage {
+    #[serde(default)]
+    usage: Option<AnthropicUsage>,
 }

 #[derive(Debug, Deserialize)]
 struct AnthropicDelta {
-    #[serde(rename = "type")]
-    delta_type: Option<String>,
    text: Option<String>,
    partial_json: Option<String>,
 }
@@ -710,7 +741,9 @@ struct AnthropicDelta {
 #[derive(Debug, Deserialize)]
 struct AnthropicError {
    #[serde(rename = "type")]
+    #[allow(dead_code)]
    error_type: String,
+    #[allow(dead_code)]
    message: String,
 }

@@ -813,32 +846,4 @@ mod tests {
        assert!(anthropic_tools[0].input_schema.required.is_some());
        assert_eq!(anthropic_tools[0].input_schema.required.as_ref().unwrap()[0], "location");
    }
-
-    #[test]
-    fn test_tool_call_conversion() {
-        let provider = AnthropicProvider::new(
-            "test-key".to_string(),
-            None,
-            None,
-            None,
-        ).unwrap();
-
-        let content = vec![
-            AnthropicContent::Text {
-                text: "I'll help you get the weather.".to_string(),
-            },
-            AnthropicContent::ToolUse {
-                id: "toolu_123".to_string(),
-                name: "get_weather".to_string(),
-                input: serde_json::json!({"location": "San Francisco, CA"}),
-            },
-        ];
-
-        let tool_calls = provider.convert_anthropic_tool_calls(&content);
-
-        assert_eq!(tool_calls.len(), 1);
-        assert_eq!(tool_calls[0].id, "toolu_123");
-        assert_eq!(tool_calls[0].tool, "get_weather");
-        assert_eq!(tool_calls[0].args["location"], "San Francisco, CA");
-    }
 }
--- a/crates/g3-providers/src/databricks.rs
+++ b/crates/g3-providers/src/databricks.rs
--- a/crates/g3-core/src/providers/embedded.rs
+++ b/crates/g3-core/src/providers/embedded.rs
@@ -1,5 +1,5 @@
 use anyhow::Result;
-use g3_providers::{
+use crate::{
    CompletionChunk, CompletionRequest, CompletionResponse, CompletionStream, LLMProvider, Message,
    MessageRole, Usage,
 };
@@ -8,22 +8,18 @@ use llama_cpp::{
    LlamaModel, LlamaParams, LlamaSession, SessionParams,
 };
 use std::path::{Path, PathBuf};
-use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
 use std::sync::Arc;
-use std::time::Duration;
 use tokio::sync::mpsc;
 use tokio::sync::Mutex;
 use tokio_stream::wrappers::ReceiverStream;
-use tracing::{debug, error, info, warn};
+use tracing::{debug, error, info};

 pub struct EmbeddedProvider {
-    model: Arc<LlamaModel>,
    session: Arc<Mutex<LlamaSession>>,
    model_name: String,
    max_tokens: u32,
    temperature: f32,
    context_length: u32,
-    generation_active: Arc<AtomicBool>,
 }

 impl EmbeddedProvider {
@@ -71,8 +67,10 @@ impl EmbeddedProvider {
            .map_err(|e| anyhow::anyhow!("Failed to load model: {}", e))?;

        // Create session with parameters
-        let mut session_params = SessionParams::default();
-        session_params.n_ctx = context_size;
+        let mut session_params = SessionParams {
+            n_ctx: context_size,
+            ..Default::default()
+        };
        if let Some(threads) = threads {
            session_params.n_threads = threads;
        }
@@ -84,13 +82,11 @@ impl EmbeddedProvider {
        info!("Successfully loaded {} model", model_type);

        Ok(Self {
-            model: Arc::new(model),
            session: Arc::new(Mutex::new(session)),
            model_name: format!("embedded-{}", model_type),
            max_tokens: max_tokens.unwrap_or(2048),
            temperature: temperature.unwrap_or(0.1),
            context_length: context_size,
-            generation_active: Arc::new(AtomicBool::new(false)),
        })
    }

@@ -143,7 +139,7 @@ impl EmbeddedProvider {
                        in_conversation = false;
                    }
                    MessageRole::Assistant => {
-                        formatted.push_str(" ");
+                        formatted.push(' ');
                        formatted.push_str(&message.content);
                        formatted.push_str("</s> ");
                        in_conversation = false;
@@ -152,8 +148,8 @@ impl EmbeddedProvider {
            }
            
            // If the last message was from user, add a space for the assistant's response
-            if messages.last().map_or(false, |m| matches!(m.role, MessageRole::User)) {
-                formatted.push_str(" ");
+            if messages.last().is_some_and(|m| matches!(m.role, MessageRole::User)) {
+                formatted.push(' ');
            }
            
            formatted
@@ -429,7 +425,6 @@ impl EmbeddedProvider {
    // Download the Qwen 2.5 7B model if it doesn't exist
    fn download_qwen_model(model_path: &Path) -> Result<()> {
        use std::fs;
-        use std::io::Write;
        use std::process::Command;
        
        const MODEL_URL: &str = "https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-GGUF/resolve/main/qwen2.5-7b-instruct-q3_k_m.gguf";
@@ -446,7 +441,7 @@ impl EmbeddedProvider {
        
        // Use curl with progress bar for download
        let output = Command::new("curl")
-            .args(&[
+            .args([
                "-L",  // Follow redirects
                "-#",  // Show progress bar
                "-f",  // Fail on HTTP errors
@@ -662,6 +657,7 @@ impl LLMProvider for EmbeddedProvider {
                                    let chunk = CompletionChunk {
                                        content: remaining_to_send.to_string(),
                                        finished: false,
+                                        usage: None,
                                        tool_calls: None,
                                    };
                                    let _ = tx.blocking_send(Ok(chunk));
@@ -688,6 +684,7 @@ impl LLMProvider for EmbeddedProvider {
                            let chunk = CompletionChunk {
                                content: remaining_to_send.to_string(),
                                finished: false,
+                                usage: None,
                                tool_calls: None,
                            };
                            let _ = tx.blocking_send(Ok(chunk));
@@ -721,6 +718,7 @@ impl LLMProvider for EmbeddedProvider {
                            let chunk = CompletionChunk {
                                content: to_send.to_string(),
                                finished: false,
+                                usage: None,
                                tool_calls: None,
                            };
                            if tx.blocking_send(Ok(chunk)).is_err() {
@@ -736,6 +734,7 @@ impl LLMProvider for EmbeddedProvider {
                        let chunk = CompletionChunk {
                            content: unsent_tokens.clone(),
                            finished: false,
+                            usage: None,
                            tool_calls: None,
                        };
                        if tx.blocking_send(Ok(chunk)).is_err() {
@@ -756,6 +755,7 @@ impl LLMProvider for EmbeddedProvider {
            let final_chunk = CompletionChunk {
                content: String::new(),
                finished: true,
+                usage: None,  // Embedded models calculate usage differently
                tool_calls: None,
            };
            let _ = tx.blocking_send(Ok(final_chunk));
--- a/crates/g3-providers/src/lib.rs
+++ b/crates/g3-providers/src/lib.rs
@@ -67,6 +67,7 @@ pub struct CompletionChunk {
    pub content: String,
    pub finished: bool,
    pub tool_calls: Option<Vec<ToolCall>>,
+    pub usage: Option<Usage>,  // Add usage tracking for streaming
 }

 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -84,8 +85,13 @@ pub struct Tool {
 }

 pub mod anthropic;
+pub mod databricks;
+pub mod embedded;
+pub mod oauth;

 pub use anthropic::AnthropicProvider;
+pub use databricks::DatabricksProvider;
+pub use embedded::EmbeddedProvider;

 /// Provider registry for managing multiple LLM providers
 pub struct ProviderRegistry {
--- a/crates/g3-providers/src/oauth.rs
+++ b/crates/g3-providers/src/oauth.rs
@@ -0,0 +1,463 @@
+use anyhow::Result;
+use axum::{extract::Query, response::Html, routing::get, Router};
+use base64::Engine;
+use chrono::{DateTime, Utc};
+use serde::{Deserialize, Serialize};
+use serde_json::Value;
+use sha2::Digest;
+use std::{collections::HashMap, fs, net::SocketAddr, path::PathBuf, sync::Arc};
+use tokio::sync::{oneshot, Mutex as TokioMutex};
+use url::Url;
+
+#[derive(Debug, Clone)]
+struct OidcEndpoints {
+    authorization_endpoint: String,
+    token_endpoint: String,
+}
+
+#[derive(Serialize, Deserialize)]
+struct TokenData {
+    /// The access token used to authenticate API requests
+    access_token: String,
+
+    /// Optional refresh token that can be used to obtain a new access token
+    /// when the current one expires, enabling offline access without user interaction
+    refresh_token: Option<String>,
+
+    /// When the access token expires (if known)
+    /// Used to determine when a token needs to be refreshed
+    expires_at: Option<DateTime<Utc>>,
+}
+
+struct TokenCache {
+    cache_path: PathBuf,
+}
+
+fn get_base_path() -> PathBuf {
+    // Use a similar pattern to Goose but for g3
+    // macOS/Linux: ~/.config/g3/databricks/oauth
+    // Windows: ~\AppData\Roaming\g3\config\databricks\oauth\
+    let mut path = dirs::config_dir().unwrap_or_else(|| PathBuf::from("."));
+    path.push("g3");
+    path.push("databricks");
+    path.push("oauth");
+    path
+}
+
+impl TokenCache {
+    fn new(host: &str, client_id: &str, scopes: &[String]) -> Self {
+        let mut hasher = sha2::Sha256::new();
+        hasher.update(host.as_bytes());
+        hasher.update(client_id.as_bytes());
+        hasher.update(scopes.join(",").as_bytes());
+        let hash = format!("{:x}", hasher.finalize());
+
+        fs::create_dir_all(get_base_path()).unwrap_or(());
+        let cache_path = get_base_path().join(format!("{}.json", hash));
+
+        Self { cache_path }
+    }
+
+    fn load_token(&self) -> Option<TokenData> {
+        if let Ok(contents) = fs::read_to_string(&self.cache_path) {
+            if let Ok(token_data) = serde_json::from_str::<TokenData>(&contents) {
+                // Only return tokens that have a refresh token
+                if token_data.refresh_token.is_some() {
+                    // If token is not expired, return it for immediate use
+                    if let Some(expires_at) = token_data.expires_at {
+                        if expires_at > Utc::now() {
+                            return Some(token_data);
+                        }
+                        // If token is expired but has refresh token, return it so we can refresh
+                        return Some(token_data);
+                    }
+                    // No expiration time but has refresh token, return it
+                    return Some(token_data);
+                }
+                // Token doesn't have a refresh token, ignore it to force a new OAuth flow
+            }
+        }
+        None
+    }
+
+    fn save_token(&self, token_data: &TokenData) -> Result<()> {
+        if let Some(parent) = self.cache_path.parent() {
+            fs::create_dir_all(parent)?;
+        }
+        let contents = serde_json::to_string(token_data)?;
+        fs::write(&self.cache_path, contents)?;
+        Ok(())
+    }
+}
+
+async fn get_workspace_endpoints(host: &str) -> Result<OidcEndpoints> {
+    let base_url = Url::parse(host).expect("Invalid host URL");
+    let oidc_url = base_url
+        .join("oidc/.well-known/oauth-authorization-server")
+        .expect("Invalid OIDC URL");
+
+    let client = reqwest::Client::new();
+    let resp = client.get(oidc_url.clone()).send().await?;
+
+    if !resp.status().is_success() {
+        return Err(anyhow::anyhow!(
+            "Failed to get OIDC configuration from {}",
+            oidc_url.to_string()
+        ));
+    }
+
+    let oidc_config: Value = resp.json().await?;
+
+    let authorization_endpoint = oidc_config
+        .get("authorization_endpoint")
+        .and_then(|v| v.as_str())
+        .ok_or_else(|| anyhow::anyhow!("authorization_endpoint not found in OIDC configuration"))?
+        .to_string();
+
+    let token_endpoint = oidc_config
+        .get("token_endpoint")
+        .and_then(|v| v.as_str())
+        .ok_or_else(|| anyhow::anyhow!("token_endpoint not found in OIDC configuration"))?
+        .to_string();
+
+    Ok(OidcEndpoints {
+        authorization_endpoint,
+        token_endpoint,
+    })
+}
+
+struct OAuthFlow {
+    endpoints: OidcEndpoints,
+    client_id: String,
+    redirect_url: String,
+    scopes: Vec<String>,
+    state: String,
+    verifier: String,
+}
+
+impl OAuthFlow {
+    fn new(
+        endpoints: OidcEndpoints,
+        client_id: String,
+        redirect_url: String,
+        scopes: Vec<String>,
+    ) -> Self {
+        Self {
+            endpoints,
+            client_id,
+            redirect_url,
+            scopes,
+            state: nanoid::nanoid!(16),
+            verifier: nanoid::nanoid!(64),
+        }
+    }
+
+    /// Extracts token data from an OAuth 2.0 token response.
+    fn extract_token_data(
+        &self,
+        token_response: &Value,
+        old_refresh_token: Option<&str>,
+    ) -> Result<TokenData> {
+        // Extract access token (required)
+        let access_token = token_response
+            .get("access_token")
+            .and_then(|v| v.as_str())
+            .ok_or_else(|| anyhow::anyhow!("access_token not found in token response"))?
+            .to_string();
+
+        // Extract refresh token if available
+        let refresh_token = token_response
+            .get("refresh_token")
+            .and_then(|v| v.as_str())
+            .map(|s| s.to_string())
+            .or_else(|| old_refresh_token.map(|s| s.to_string()));
+
+        // Handle token expiration
+        let expires_at =
+            if let Some(expires_in) = token_response.get("expires_in").and_then(|v| v.as_u64()) {
+                // Traditional OAuth flow with expires_in seconds
+                Some(Utc::now() + chrono::Duration::seconds(expires_in as i64))
+            } else {
+                // If the server doesn't provide any expiration info, log it but don't set an expiration
+                tracing::debug!(
+                    "No expiration information provided by server, token expiration unknown."
+                );
+                None
+            };
+
+        Ok(TokenData {
+            access_token,
+            refresh_token,
+            expires_at,
+        })
+    }
+
+    fn get_authorization_url(&self) -> String {
+        let challenge = {
+            let digest = sha2::Sha256::digest(self.verifier.as_bytes());
+            base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(digest)
+        };
+
+        let params = [
+            ("response_type", "code"),
+            ("client_id", &self.client_id),
+            ("redirect_uri", &self.redirect_url),
+            ("scope", &self.scopes.join(" ")),
+            ("state", &self.state),
+            ("code_challenge", &challenge),
+            ("code_challenge_method", "S256"),
+        ];
+
+        format!(
+            "{}?{}",
+            self.endpoints.authorization_endpoint,
+            serde_urlencoded::to_string(params).unwrap()
+        )
+    }
+
+    async fn exchange_code_for_token(&self, code: &str) -> Result<TokenData> {
+        let params = [
+            ("grant_type", "authorization_code"),
+            ("code", code),
+            ("redirect_uri", &self.redirect_url),
+            ("code_verifier", &self.verifier),
+            ("client_id", &self.client_id),
+        ];
+
+        let client = reqwest::Client::new();
+        let resp = client
+            .post(&self.endpoints.token_endpoint)
+            .header("Content-Type", "application/x-www-form-urlencoded")
+            .form(&params)
+            .send()
+            .await?;
+
+        if !resp.status().is_success() {
+            let err_text = resp.text().await?;
+            return Err(anyhow::anyhow!(
+                "Failed to exchange code for token: {}",
+                err_text
+            ));
+        }
+
+        let token_response: Value = resp.json().await?;
+        self.extract_token_data(&token_response, None)
+    }
+
+    async fn refresh_token(&self, refresh_token: &str) -> Result<TokenData> {
+        let params = [
+            ("grant_type", "refresh_token"),
+            ("refresh_token", refresh_token),
+            ("client_id", &self.client_id),
+        ];
+
+        tracing::debug!("Refreshing token using refresh_token");
+
+        let client = reqwest::Client::new();
+        let resp = client
+            .post(&self.endpoints.token_endpoint)
+            .header("Content-Type", "application/x-www-form-urlencoded")
+            .form(&params)
+            .send()
+            .await?;
+
+        if !resp.status().is_success() {
+            let err_text = resp.text().await?;
+            return Err(anyhow::anyhow!("Failed to refresh token: {}", err_text));
+        }
+
+        let token_response: Value = resp.json().await?;
+        self.extract_token_data(&token_response, Some(refresh_token))
+    }
+
+    async fn execute(&self) -> Result<TokenData> {
+        // Create a channel that will send the auth code from the app process
+        let (tx, rx) = oneshot::channel();
+        let state = self.state.clone();
+        let tx = Arc::new(TokioMutex::new(Some(tx)));
+
+        // Setup a server that will receive the redirect, capture the code, and display success/failure
+        let app = Router::new().route(
+            "/",
+            get(move |Query(params): Query<HashMap<String, String>>| {
+                let tx = Arc::clone(&tx);
+                let state = state.clone();
+                async move {
+                    let code = params.get("code").cloned();
+                    let received_state = params.get("state").cloned();
+
+                    if let (Some(code), Some(received_state)) = (code, received_state) {
+                        if received_state == state {
+                            if let Some(sender) = tx.lock().await.take() {
+                                if sender.send(code).is_ok() {
+                                    return Html(
+                                        "<h2>G3 Authentication Success</h2><p>You can close this window and return to your terminal.</p>",
+                                    );
+                                }
+                            }
+                            Html("<h2>Error</h2><p>Authentication already completed.</p>")
+                        } else {
+                            Html("<h2>Error</h2><p>State mismatch.</p>")
+                        }
+                    } else {
+                        Html("<h2>Error</h2><p>Authentication failed.</p>")
+                    }
+                }
+            }),
+        );
+
+        // Start the server to accept the oauth code
+        let redirect_url = Url::parse(&self.redirect_url)?;
+        let port = redirect_url.port().unwrap_or(80);
+        let addr = SocketAddr::from(([127, 0, 0, 1], port));
+
+        let listener = tokio::net::TcpListener::bind(addr).await?;
+
+        let server_handle = tokio::spawn(async move {
+            let server = axum::serve(listener, app);
+            server.await.unwrap();
+        });
+
+        // Open the browser which will redirect with the code to the server
+        let authorization_url = self.get_authorization_url();
+        if std::env::var("G3_RETRO_MODE").is_err() {
+            println!("🔐 Opening browser for Databricks authentication...");
+        }
+        if webbrowser::open(&authorization_url).is_err() {
+            println!(
+                "Please open this URL in your browser:\n{}",
+                authorization_url
+            );
+        }
+
+        // Wait for the authorization code with a timeout
+        let code = tokio::time::timeout(
+            std::time::Duration::from_secs(120), // 2 minute timeout
+            rx,
+        )
+        .await
+        .map_err(|_| anyhow::anyhow!("Authentication timed out after 2 minutes"))??;
+
+        // Stop the server
+        server_handle.abort();
+
+        if std::env::var("G3_RETRO_MODE").is_err() {
+            println!("✅ Authentication successful! Exchanging code for token...");
+        }
+
+        // Exchange the code for a token
+        self.exchange_code_for_token(&code).await
+    }
+}
+
+pub async fn get_oauth_token_async(
+    host: &str,
+    client_id: &str,
+    redirect_url: &str,
+    scopes: &[String],
+) -> Result<String> {
+    let token_cache = TokenCache::new(host, client_id, scopes);
+
+    // Try cache first
+    if let Some(token) = token_cache.load_token() {
+        // If token has an expiration time, check if it's expired
+        if let Some(expires_at) = token.expires_at {
+            if expires_at > Utc::now() {
+                tracing::debug!("Using cached token");
+                return Ok(token.access_token);
+            }
+            // Token is expired, will try to refresh below
+            tracing::debug!("Token is expired, attempting to refresh");
+        } else {
+            // No expiration time was provided by the server
+            tracing::debug!("Token has no expiration time, using cached token");
+            return Ok(token.access_token);
+        }
+
+        // Token is expired or has no expiration, try to refresh if we have a refresh token
+        if let Some(refresh_token) = token.refresh_token {
+            // Get endpoints for token refresh
+            match get_workspace_endpoints(host).await {
+                Ok(endpoints) => {
+                    let flow = OAuthFlow::new(
+                        endpoints,
+                        client_id.to_string(),
+                        redirect_url.to_string(),
+                        scopes.to_vec(),
+                    );
+
+                    // Try to refresh the token
+                    match flow.refresh_token(&refresh_token).await {
+                        Ok(new_token) => {
+                            if let Err(e) = token_cache.save_token(&new_token) {
+                                tracing::warn!("Failed to save refreshed token: {}", e);
+                            }
+                            tracing::info!("Successfully refreshed token");
+                            return Ok(new_token.access_token);
+                        }
+                        Err(e) => {
+                            tracing::warn!(
+                                "Failed to refresh token, will try new auth flow: {}",
+                                e
+                            );
+                            // Continue to new auth flow
+                        }
+                    }
+                }
+                Err(e) => {
+                    tracing::warn!("Failed to get endpoints for token refresh: {}", e);
+                    // Continue to new auth flow
+                }
+            }
+        }
+    }
+
+    // Get endpoints and execute flow for a new token
+    let endpoints = get_workspace_endpoints(host).await?;
+    let flow = OAuthFlow::new(
+        endpoints,
+        client_id.to_string(),
+        redirect_url.to_string(),
+        scopes.to_vec(),
+    );
+
+    // Execute the OAuth flow and get token
+    let token = flow.execute().await?;
+
+    // Cache and return
+    token_cache.save_token(&token)?;
+    if std::env::var("G3_RETRO_MODE").is_err() {
+        println!("🎉 Databricks authentication complete!");
+    }
+    Ok(token.access_token)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_token_cache() -> Result<()> {
+        let cache = TokenCache::new(
+            "https://example.com",
+            "test-client",
+            &["scope1".to_string()],
+        );
+
+        // Test with expiration time
+        let token_data = TokenData {
+            access_token: "test-token".to_string(),
+            refresh_token: Some("test-refresh-token".to_string()),
+            expires_at: Some(Utc::now() + chrono::Duration::hours(1)),
+        };
+
+        cache.save_token(&token_data)?;
+
+        let loaded_token = cache.load_token().unwrap();
+        assert_eq!(loaded_token.access_token, token_data.access_token);
+        assert_eq!(loaded_token.refresh_token, token_data.refresh_token);
+        assert!(loaded_token.expires_at.is_some());
+
+        Ok(())
+    }
+}
--- a/test-ai-requirements.sh
+++ b/test-ai-requirements.sh
@@ -0,0 +1,39 @@
+#!/bin/bash
+# Test script for AI-enhanced interactive requirements mode
+
+echo "Testing AI-enhanced interactive requirements mode..."
+echo ""
+
+# Create a test workspace
+TEST_WORKSPACE="/tmp/g3-test-interactive-$(date +%s)"
+mkdir -p "$TEST_WORKSPACE"
+
+echo "Test workspace: $TEST_WORKSPACE"
+echo ""
+
+# Create sample brief input
+BRIEF_INPUT="build a calculator cli in rust with basic operations"
+
+echo "Brief input:"
+echo "---"
+echo "$BRIEF_INPUT"
+echo "---"
+echo ""
+
+echo "This will:"
+echo "1. Send brief input to AI"
+echo "2. AI generates structured requirements.md"
+echo "3. Show enhanced requirements"
+echo "4. Prompt for confirmation (y/e/n)"
+echo ""
+
+echo "To test manually, run:"
+echo "cargo run -- --autonomous --interactive-requirements --workspace $TEST_WORKSPACE"
+echo ""
+echo "Then type: $BRIEF_INPUT"
+echo "Press Ctrl+D"
+echo "Review the AI-generated requirements"
+echo "Choose 'y' to proceed, 'e' to edit, or 'n' to cancel"
+echo ""
+
+echo "Test workspace will be at: $TEST_WORKSPACE"
Author	SHA1	Message	Date
Michael Neale	f2ed303550	Revert "don't need this" This reverts commit `93121c18e0`.	2025-10-22 14:53:25 +11:00
Michael Neale	93121c18e0	don't need this	2025-10-22 14:30:13 +11:00
Michael Neale	ed84a940f9	tweak auto mode	2025-10-22 14:27:17 +11:00
Michael Neale	3128b5d8b9	can choose per mode models for auto mode	2025-10-22 14:19:00 +11:00
Dhanji Prasanna	758e255af8	dont run safaridriver --enable each time	2025-10-21 16:00:58 +11:00
Dhanji Prasanna	393826ae02	webdriver tools	2025-10-21 14:34:41 +11:00
Dhanji Prasanna	3afad3d61f	progressive context thinning	2025-10-20 15:29:44 +11:00
Dhanji Prasanna	2488cc54d5	docs: update README and DESIGN to reflect current project state - Add g3-computer-control crate to architecture documentation - Document all 13 tools including computer control and TODO management - Add context thinning feature documentation (50-80% thresholds) - Update tool ecosystem section with complete tool list - Remove broken link to non-existent COMPUTER_CONTROL.md - Update workspace count from 5 to 6 crates - Add platform-specific implementation details for computer control - Document OCR support via Tesseract - Clarify setup instructions for computer control features	2025-10-20 15:03:22 +11:00
Dhanji Prasanna	2ad0c9a3fd	todo list formatting	2025-10-20 14:27:53 +11:00
Dhanji Prasanna	2008a81193	fix to pass feedback to player (broken by todo system)	2025-10-20 14:12:08 +11:00
Dhanji Prasanna	776f5034b8	TODO tools	2025-10-20 10:50:53 +11:00
Dhanji Prasanna	92bece957b	colorizing tool calls	2025-10-18 16:09:30 +11:00
Dhanji Prasanna	767299ff4e	minor	2025-10-18 16:03:58 +11:00
Dhanji Prasanna	9d35449be8	~ expansion for read_file and str_replace	2025-10-18 16:01:15 +11:00
Dhanji Prasanna	da652bf287	computer control tools	2025-10-18 14:16:50 +11:00
Dhanji Prasanna	a566171203	small turn completing bug	2025-10-18 13:25:23 +11:00
Dhanji Prasanna	347c9e1e00	colorize timing based on duration	2025-10-17 13:54:21 +11:00
Dhanji Prasanna	aa7eda0331	fix wall clock timing	2025-10-17 10:36:21 +11:00
Dhanji Prasanna	e42c76f3b9	Tune coach pickiness down	2025-10-17 10:28:08 +11:00
Dhanji Prasanna	dd211fab1c	panic fix	2025-10-17 09:50:01 +11:00
Dhanji R. Prasanna	bcece38473	Merge pull request #5 from dhanji/micn/agent-tweaks load AGENTS.md if there	2025-10-16 15:06:14 +11:00
Michael Neale	3ff8413538	loading agents	2025-10-16 15:03:23 +11:00
Michael Neale	de2a761dbd	Merge branch 'main' into micn/agent-tweaks	2025-10-16 14:49:16 +11:00
Dhanji Prasanna	e5a6ab66d7	turn histogram from autonomous mode	2025-10-16 14:35:47 +11:00
Dhanji Prasanna	444c0bc6c6	--quiet flag suppresses logs	2025-10-16 13:08:26 +11:00
Michael Neale	758a6b18c8	load agents if there	2025-10-16 12:00:50 +11:00
Dhanji Prasanna	41c1363fb5	guard case to ensure approval terminates run	2025-10-16 11:01:46 +11:00
Dhanji Prasanna	52ada78151	requirements flag	2025-10-16 10:08:04 +11:00
Dhanji Prasanna	662748ed23	better formatting cli	2025-10-15 22:04:39 +11:00
Dhanji Prasanna	beccc8fa15	reset filter suppression state between tool calls (still broken)	2025-10-15 21:15:24 +11:00
Dhanji Prasanna	c9037ede22	fixed feedback handoff in autonomous mode	2025-10-15 14:07:25 +11:00
Dhanji Prasanna	793fc544c0	some cleanup	2025-10-15 11:12:26 +11:00
Dhanji Prasanna	fb64b7fe32	fixed filtering and tool call timeouts	2025-10-15 10:18:20 +11:00
Dhanji Prasanna	befc55152d	fixed tool call cli output	2025-10-15 09:55:59 +11:00
Dhanji Prasanna	bb90cc7826	some fixes	2025-10-14 12:44:02 +11:00
Dhanji Prasanna	5110da0c61	design doc	2025-10-14 12:33:36 +11:00
Dhanji Prasanna	bfd256db3b	fix tool output	2025-10-14 12:21:22 +11:00
Dhanji R. Prasanna	cef4d12d36	small cleanup to shell	2025-10-13 21:52:47 +11:00
Dhanji R. Prasanna	45eb0a4b63	small compile error	2025-10-13 21:51:44 +11:00
Dhanji R. Prasanna	a914afedd8	panic fix in tui	2025-10-13 21:49:22 +11:00
Dhanji Prasanna	627fdcd9bf	streaming tool call attempt 1	2025-10-13 20:25:12 +11:00
Dhanji Prasanna	b43b693b60	small tweak to tmp prompting	2025-10-13 13:38:34 +11:00
Dhanji Prasanna	062e6de63f	fix for buffered messages at end, colorized context bars	2025-10-13 13:36:37 +11:00
Dhanji Prasanna	318355e864	Added --provider and --model flags	2025-10-12 17:05:58 +11:00
Dhanji Prasanna	037bff7021	UTF-8 decoding bug	2025-10-12 14:54:28 +11:00
Dhanji Prasanna	05c21b61df	coach mode feedback fix	2025-10-11 16:13:39 +11:00
Dhanji Prasanna	f42e43a0d6	auto mode report	2025-10-11 15:11:07 +11:00
Dhanji Prasanna	658a335615	prompt change	2025-10-11 15:07:47 +11:00
Dhanji Prasanna	e89e1acf41	auto mode and message fix	2025-10-11 15:06:37 +11:00
Dhanji Prasanna	7dd4fbf9b6	restart turn on error	2025-10-11 13:47:13 +11:00
Dhanji Prasanna	5fb631d5c3	cosmetic fixes to tool call headers	2025-10-11 13:32:35 +11:00
Dhanji Prasanna	13236a1be5	ui writer fixes	2025-10-10 15:39:42 +11:00
Dhanji Prasanna	1bae19abd4	Revert "fix for tool args and missing msgs" This reverts commit `1e9ff972d9`.	2025-10-10 15:39:42 +11:00
Dhanji Prasanna	d16a694862	show messages fix	2025-10-10 15:36:57 +11:00
Dhanji Prasanna	4a819e8f27	context window counting bug	2025-10-10 14:40:10 +11:00
Dhanji Prasanna	1e9ff972d9	fix for tool args and missing msgs	2025-10-10 14:28:02 +11:00
Dhanji Prasanna	57b7bcb0de	cosmetic tool call stuff	2025-10-10 14:18:35 +11:00
Dhanji Prasanna	426a9b88a9	readme tweaks	2025-10-10 14:08:37 +11:00
Dhanji Prasanna	2d959b3d63	cosmetic	2025-10-10 13:52:04 +11:00
Dhanji Prasanna	16216532d0	newline	2025-10-10 13:46:08 +11:00
Dhanji Prasanna	3ef7ec0d9f	colorize	2025-10-10 13:38:38 +11:00
Dhanji Prasanna	0ad52a2eb2	tighten tool output in normal cli	2025-10-10 10:03:15 +11:00
Dhanji Prasanna	1e44971cf8	error recovery and tests	2025-10-10 09:35:03 +11:00
Dhanji Prasanna	ef01226ee1	auto readme	2025-10-09 14:56:25 +11:00
Dhanji Prasanna	260c949576	token counting fixes	2025-10-09 12:11:21 +11:00
Dhanji Prasanna	9d1eef82b9	final output fix for auto mode	2025-10-09 11:16:21 +11:00
Dhanji Prasanna	cd489fb235	partial readfile support	2025-10-09 11:08:02 +11:00
Dhanji Prasanna	0973b83d3a	fix build warnings	2025-10-08 14:06:25 +11:00
Dhanji Prasanna	5e6ac4e5f5	tweak to colors	2025-10-08 13:43:29 +11:00
Dhanji Prasanna	e1b1ed560a	dracula theme tweaks	2025-10-08 12:39:14 +11:00
Dhanji Prasanna	8e4d0a3975	dracula theme tweak	2025-10-08 11:19:00 +11:00
Dhanji Prasanna	b369a1f5c3	fixes for coach mode	2025-10-08 11:17:24 +11:00
Dhanji Prasanna	e11a287acc	color schemes	2025-10-08 11:14:56 +11:00
Dhanji Prasanna	ed769bd58a	some graphing updates	2025-10-07 15:13:45 +11:00
Dhanji Prasanna	e6cec5ef0f	retry on errors	2025-10-07 11:20:19 +11:00
Dhanji Prasanna	5a83e1b7e0	input box fixes	2025-10-06 14:48:27 +11:00
Dhanji Prasanna	c9487db5e7	only show tool detail when running	2025-10-06 14:33:15 +11:00
Dhanji Prasanna	340ba78eb3	remove output box border	2025-10-06 14:23:17 +11:00
Dhanji Prasanna	4a25191c77	bug fix on end of agent turn	2025-10-06 13:25:19 +11:00
Dhanji Prasanna	bcba99ec6c	auto refresh token	2025-10-04 17:32:48 +10:00
Dhanji Prasanna	1a57dd3b1d	tool window scrolling	2025-10-04 16:34:59 +10:00
Dhanji Prasanna	1379af7159	tool headers working	2025-10-04 16:24:33 +10:00
Dhanji Prasanna	9b7c228134	scroll hack	2025-10-04 15:05:06 +10:00
Dhanji Prasanna	f562301aa2	tweaks to newline	2025-10-04 13:30:11 +10:00
Dhanji Prasanna	cdfca615e3	tail cursor	2025-10-03 14:23:56 +10:00
Dhanji Prasanna	54e2a66b7d	fixed newline character messup	2025-10-03 14:04:17 +10:00
Dhanji Prasanna	dfa54f20ec	tmp file directive	2025-10-03 13:09:02 +10:00
Dhanji Prasanna	213dfd28d4	colors	2025-10-03 11:05:01 +10:00
Dhanji Prasanna	b39fd02603	scrolling fixed	2025-10-03 11:01:39 +10:00
Dhanji Prasanna	56e13ced64	tool calling boxes	2025-10-02 15:50:04 +10:00
Dhanji Prasanna	4e457960ed	processing blink	2025-10-02 15:38:27 +10:00
Dhanji Prasanna	1faf16b23a	tweaks	2025-10-02 15:34:37 +10:00
Dhanji Prasanna	4de994a2a7	tweak	2025-10-02 15:23:15 +10:00
Dhanji Prasanna	dd89067ac1	minor	2025-10-02 15:18:56 +10:00
Dhanji Prasanna	c065532c41	softer colors	2025-10-02 15:15:21 +10:00
Dhanji Prasanna	7ce1bfc8e2	tweaks to ui	2025-10-02 15:03:23 +10:00
Dhanji Prasanna	cd7f8d3fc7	model only	2025-10-02 14:58:03 +10:00
Dhanji Prasanna	bf5efde06e	show model and provider	2025-10-02 14:53:38 +10:00
Dhanji Prasanna	57b1b51e65	retro mode ui!	2025-10-02 14:47:19 +10:00
Dhanji Prasanna	a87f81042a	remove edit_file	2025-10-02 13:58:57 +10:00
Dhanji Prasanna	8c7dd146f8	UI writer abstraction instead of printlns everywhere	2025-10-02 11:06:14 +10:00
Dhanji Prasanna	e324ddd99d	hopefully a bit better tool call detection	2025-10-02 10:27:58 +10:00
Dhanji Prasanna	9638f40cfb	some autonomous mode fixes	2025-10-02 09:45:18 +10:00
Dhanji Prasanna	98cf72c12a	suppress printout of final_output	2025-10-01 15:26:55 +10:00
Dhanji Prasanna	046b54c49b	move embedded provider to a better crate	2025-10-01 15:19:37 +10:00
Dhanji Prasanna	b9679e14dc	force update of head	2025-10-01 14:12:23 +10:00
Dhanji Prasanna	a843ecc9d0	suppress json tool calls in raw text	2025-10-01 13:20:13 +10:00
Dhanji Prasanna	3349a33106	dracula theme	2025-10-01 11:24:30 +10:00
Dhanji Prasanna	1621d081ec	tui lib for nicer cli	2025-10-01 11:19:34 +10:00
Dhanji Prasanna	5f642061de	error handling in autonomous mode	2025-10-01 11:01:23 +10:00
Dhanji Prasanna	f0ddfdc3d2	move logs into subdir	2025-09-30 22:29:49 +10:00
Dhanji Prasanna	92318ff51c	str_replace fixes	2025-09-30 22:24:54 +10:00
Dhanji Prasanna	03229effba	increase max iterations to 400	2025-09-30 21:35:42 +10:00
Dhanji Prasanna	f99c61331c	str_replace instead of edit_file much better	2025-09-30 21:15:28 +10:00
Dhanji Prasanna	b3c2c0ad30	edit file fixes	2025-09-30 20:41:14 +10:00
Dhanji Prasanna	3c4da6f974	update readme	2025-09-30 14:00:04 +10:00
Dhanji Prasanna	270cbae1e6	edit_file	2025-09-30 13:50:02 +10:00
Dhanji Prasanna	69fc3e90dc	max_tokens fix	2025-09-29 11:05:57 +10:00
Dhanji Prasanna	ce273ba3fb	multiline input with \	2025-09-29 10:23:41 +10:00
Dhanji Prasanna	c4ee4a6cde	basic project model	2025-09-29 09:23:27 +10:00
Dhanji Prasanna	315596e316	only emit final response once	2025-09-29 08:22:26 +10:00
Dhanji Prasanna	39ef13e317	fix a looping error in iterations	2025-09-29 06:54:55 +10:00
Dhanji Prasanna	4e64555008	max tokens fix for databricks	2025-09-29 06:45:53 +10:00
Dhanji Prasanna	f3cf9b688e	tool call cosmetic cleanup	2025-09-27 20:40:06 +10:00
Dhanji Prasanna	e2354b0679	allow more iterations per turn	2025-09-27 20:34:39 +10:00
Dhanji Prasanna	c490228824	databricks support	2025-09-27 17:28:02 +10:00
Dhanji Prasanna	258eb4fd54	minor	2025-09-27 15:49:55 +10:00
Dhanji Prasanna	091b824b1e	more debug logging	2025-09-27 15:43:33 +10:00
Dhanji Prasanna	2b561516b6	cleanup	2025-09-27 15:16:42 +10:00
Dhanji Prasanna	1046b30138	print report at end	2025-09-27 15:01:59 +10:00
Dhanji Prasanna	7fbfec50d8	working much simpler	2025-09-27 14:46:53 +10:00
Dhanji Prasanna	3c74cd410e	remove subtasks	2025-09-27 14:39:08 +10:00
Dhanji Prasanna	811c642b17	suppress json text tool calls a bit jankily	2025-09-27 14:34:54 +10:00
Dhanji Prasanna	016ee80554	cap total subtasks	2025-09-27 14:28:13 +10:00
Dhanji Prasanna	622de9d540	go straight to coach turn if files exist	2025-09-27 14:19:54 +10:00
Dhanji Prasanna	e82821189b	write/read file support	2025-09-27 13:43:09 +10:00
Dhanji Prasanna	7595ee083e	logging optimization	2025-09-27 12:18:27 +10:00
Dhanji Prasanna	fb114cfcf5	imrpv	2025-09-27 06:29:33 +10:00
Dhanji Prasanna	e97614df76	better output	2025-09-26 22:37:30 +10:00
Dhanji Prasanna	58052fd0fe	autonomous mode	2025-09-26 22:34:47 +10:00
Dhanji Prasanna	6ec596ae4d	minor	2025-09-26 21:55:15 +10:00
Dhanji Prasanna	5ef4a74468	minor	2025-09-26 21:38:01 +10:00
Dhanji Prasanna	dd20e0bb01	some cleanup of converstation mgmt	2025-09-22 20:38:44 +10:00