G3 - AI Coding Agent
G3 is an AI coding agent that helps you complete tasks by writing code and executing commands. Built in Rust, it provides a modular architecture for working with multiple Large Language Model (LLM) providers, along with code generation and task automation capabilities.
Architecture Overview
G3 follows a modular architecture organized as a Rust workspace with multiple crates, each responsible for specific functionality:
Core Components
g3-core
The heart of the agent system, containing:
- Agent Engine: Main orchestration logic for handling conversations, tool execution, and task management
- Context Window Management: Intelligent tracking of token usage with context thinning (50-80%) and auto-summarization at 80% capacity
- Tool System: Built-in tools for file operations, shell commands, computer control, TODO management, and structured output
- Streaming Response Parser: Real-time parsing of LLM responses with tool call detection and execution
- Task Execution: Support for single and iterative task execution with automatic retry logic
g3-providers
Abstraction layer for LLM providers:
- Provider Interface: Common trait-based API for different LLM backends
- Multiple Provider Support:
  - Anthropic (Claude models)
  - Databricks (DBRX and other models)
  - Local/embedded models via llama.cpp with Metal acceleration on macOS
- OAuth Authentication: Built-in OAuth flow support for secure provider authentication
- Provider Registry: Dynamic provider management and selection
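A trait-based provider interface with a registry could look roughly like the sketch below. This is an illustration only, not the actual g3-providers API: the names `Provider`, `EchoProvider`, and `ProviderRegistry`, and the synchronous `complete` signature, are assumptions made for the example (a real implementation would most likely be async and stream responses).

```rust
use std::collections::HashMap;

/// Hypothetical provider interface; the real trait in g3-providers may differ.
pub trait Provider {
    /// Name used to look the provider up in the registry.
    fn name(&self) -> &str;
    /// Send a prompt and return the completion text, or an error message.
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Stand-in backend that echoes the prompt; it exists only for illustration.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {}", prompt))
    }
}

/// Hypothetical registry that stores providers by name behind the trait.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: HashMap<String, Box<dyn Provider>>,
}

impl ProviderRegistry {
    /// Register a provider under the name it reports.
    pub fn register(&mut self, provider: Box<dyn Provider>) {
        self.providers.insert(provider.name().to_string(), provider);
    }

    /// Look up a provider by name; callers only ever see the trait.
    pub fn get(&self, name: &str) -> Option<&dyn Provider> {
        self.providers.get(name).map(|p| p.as_ref())
    }
}
```

Because callers depend only on the trait, selecting a different backend comes down to registering another implementation and looking it up by a different name.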
g3-config
Configuration management system:
- Environment-based configuration
- Provider credentials and settings
- Model selection and parameters
- Runtime configuration options
g3-execution
Task execution framework:
- Task planning and decomposition
- Execution strategies (sequential, parallel)
- Error handling and retry mechanisms
- Progress tracking and reporting
g3-computer-control
Computer control capabilities:
- Mouse and keyboard automation
- UI element inspection and interaction
- Screenshot capture and window management
- OCR text extraction via Tesseract
g3-cli
Command-line interface:
- Interactive terminal interface
- Task submission and monitoring
- Configuration management commands
- Session management
Error Handling & Resilience
G3 includes robust error handling with automatic retry logic:
- Recoverable Error Detection: Automatically identifies recoverable errors (rate limits, network issues, server errors, timeouts)
- Exponential Backoff with Jitter: Implements intelligent retry delays to avoid overwhelming services
- Detailed Error Logging: Captures comprehensive error context including stack traces, request/response data, and session information
- Error Persistence: Saves detailed error logs to logs/errors/ for post-mortem analysis
- Graceful Degradation: Non-recoverable errors are logged with full context before terminating
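The retry behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not G3's actual implementation: `ErrorKind`, `is_recoverable`, and `backoff_delay` are hypothetical names, the base and maximum delays are supplied by the caller, and the jitter is passed in as a number in [0, 1] (in practice it would come from a random number generator) so the example stays dependency-free and deterministic.

```rust
use std::time::Duration;

/// Illustrative error categories; the real classification in G3 may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ErrorKind {
    RateLimit,
    Network,
    Server,
    Timeout,
    InvalidRequest,
}

/// Rate limits, network issues, server errors, and timeouts are treated as
/// recoverable; anything else is not retried.
pub fn is_recoverable(kind: ErrorKind) -> bool {
    matches!(
        kind,
        ErrorKind::RateLimit | ErrorKind::Network | ErrorKind::Server | ErrorKind::Timeout
    )
}

/// Delay before retry number `attempt` (0-based).
///
/// The delay doubles on every attempt (base * 2^attempt), is capped at `max`,
/// and is then scaled by a jitter factor between 0.5 and 1.0 derived from
/// `random_unit`, which is clamped to [0, 1].
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration, random_unit: f64) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    let capped = base.saturating_mul(factor).min(max);
    let jitter = 0.5 + 0.5 * random_unit.clamp(0.0, 1.0);
    capped.mul_f64(jitter)
}
```

Scaling the capped delay by a random factor spreads out retries from concurrent sessions, so they do not all hit the provider again at the same instant.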
Key Features
Intelligent Context Management
- Automatic context window monitoring with percentage-based tracking
- Smart auto-summarization when approaching token limits
- Context thinning at the 50%, 60%, 70%, and 80% thresholds, which automatically replaces large tool results with file references
- Conversation history preservation through summaries
- Dynamic token allocation for different providers (4k to 200k+ tokens)
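The percentage-based thresholds above can be expressed as a small decision helper. The sketch below is an assumption about how such a check might be written, not code taken from g3-core: the function names are hypothetical, and how thinning and summarization interact at exactly 80% is not specified in this README, so the two checks are kept separate.

```rust
/// Thinning thresholds listed in this README, in percent of the context window.
const THIN_THRESHOLDS: [u8; 4] = [50, 60, 70, 80];
/// Auto-summarization threshold listed in this README, in percent.
const SUMMARIZE_AT: u8 = 80;

/// Percentage of the context window in use, clamped to 100.
/// A zero-capacity window is reported as full.
pub fn percent_used(used_tokens: usize, capacity: usize) -> u8 {
    if capacity == 0 {
        return 100;
    }
    (used_tokens.saturating_mul(100) / capacity).min(100) as u8
}

/// Returns the highest thinning threshold crossed when usage moved from
/// `previous` to `current` percent, or `None` if no threshold was crossed.
pub fn thinning_threshold_crossed(previous: u8, current: u8) -> Option<u8> {
    THIN_THRESHOLDS
        .iter()
        .copied()
        .filter(|&t| previous < t && current >= t)
        .max()
}

/// Whether usage has reached the auto-summarization threshold.
pub fn should_summarize(current: u8) -> bool {
    current >= SUMMARIZE_AT
}
```

Comparing the previous and current percentages means thinning fires once per threshold crossing rather than on every turn spent above that threshold.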
Interactive Control Commands
G3's interactive CLI includes control commands for manual context management:
- /compact: Manually trigger summarization to compact conversation history
- /thinnify: Manually trigger context thinning to replace large tool results with file references
- /readme: Reload README.md and AGENTS.md from disk without restarting
- /stats: Show detailed context and performance statistics
- /help: Display all available control commands
These commands give you fine-grained control over context management, allowing you to proactively optimize token usage and refresh project documentation. See Control Commands Documentation for detailed usage.
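For readers curious how slash commands like these are typically separated from ordinary prompts, here is a minimal sketch. The `ControlCommand` enum and `parse_control_command` function are hypothetical names for illustration; only the command strings themselves come from this README.

```rust
/// The control commands listed in this README.
#[derive(Debug, PartialEq)]
pub enum ControlCommand {
    Compact,
    Thinnify,
    Readme,
    Stats,
    Help,
}

/// Returns the matching control command, or `None` when the input is not a
/// recognized command and should be handled as a regular prompt.
pub fn parse_control_command(input: &str) -> Option<ControlCommand> {
    match input.trim() {
        "/compact" => Some(ControlCommand::Compact),
        "/thinnify" => Some(ControlCommand::Thinnify),
        "/readme" => Some(ControlCommand::Readme),
        "/stats" => Some(ControlCommand::Stats),
        "/help" => Some(ControlCommand::Help),
        _ => None,
    }
}
```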
Tool Ecosystem
- File Operations: Read, write, and edit files with line-range precision
- Shell Integration: Execute system commands with output capture
- Code Generation: Structured code generation with syntax awareness
- TODO Management: Read and write TODO lists with markdown checkbox format
- Computer Control (Experimental): Automate desktop applications
  - Mouse and keyboard control
  - macOS Accessibility API for native app automation (via --macax flag)
  - UI element inspection
  - Screenshot capture and window management
  - OCR text extraction from images and screen regions
  - Window listing and identification
- Code Search: Embedded tree-sitter for syntax-aware code search (Rust, Python, JavaScript, TypeScript) - see Code Search Guide
- Final Output: Formatted result presentation
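The markdown checkbox format used by the TODO Management tool can be read and written with very little code. The sketch below shows one way to do it; `TodoItem`, `parse_todos`, and `render_todos` are hypothetical names, and the exact format G3 accepts (nesting, alternative markers) may be broader than what is handled here.

```rust
/// One TODO entry in markdown checkbox form, e.g. "- [x] Set up repo".
#[derive(Debug, PartialEq)]
pub struct TodoItem {
    pub done: bool,
    pub text: String,
}

/// Parse every line of the form "- [ ] text" or "- [x] text".
/// Lines that are not checkbox items are skipped.
pub fn parse_todos(markdown: &str) -> Vec<TodoItem> {
    markdown
        .lines()
        .filter_map(|line| {
            let rest = line.trim_start().strip_prefix("- [")?;
            let (mark, text) = rest.split_once("] ")?;
            let done = match mark {
                " " => false,
                "x" | "X" => true,
                _ => return None,
            };
            Some(TodoItem {
                done,
                text: text.trim().to_string(),
            })
        })
        .collect()
}

/// Render items back to markdown, one checkbox per line.
pub fn render_todos(items: &[TodoItem]) -> String {
    items
        .iter()
        .map(|item| format!("- [{}] {}", if item.done { "x" } else { " " }, item.text))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Keeping the list in plain markdown means the same file stays readable and editable by hand between agent runs.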
Provider Flexibility
- Support for multiple LLM providers through a unified interface
- Hot-swappable providers without code changes
- Provider-specific optimizations and feature support
- Local model support for offline operation
Task Automation
- Single-shot task execution for quick operations
- Iterative task mode for complex, multi-step workflows
- Automatic error recovery and retry logic
- Progress tracking and intermediate result handling
Language & Technology Stack
- Language: Rust (2021 edition)
- Async Runtime: Tokio for concurrent operations
- HTTP Client: Reqwest for API communications
- Serialization: Serde for JSON handling
- CLI Framework: Clap for command-line parsing
- Logging: Tracing for structured logging
- Local Models: llama.cpp with Metal acceleration support
Use Cases
G3 is designed for:
- Automated code generation and refactoring
- File manipulation and project scaffolding
- System administration tasks
- Data processing and transformation
- API integration and testing
- Documentation generation
- Complex multi-step workflows
- Desktop application automation and testing
Getting Started
Default Mode: Accumulative Autonomous
The default interactive mode is accumulative autonomous mode, which combines interactive and autonomous workflows:
# Simply run g3 in any directory
g3
# You'll be prompted to describe what you want to build
# Each input you provide:
# 1. Gets added to accumulated requirements
# 2. Automatically triggers autonomous mode (coach-player loop)
# 3. Implements your requirements iteratively
# Example session:
requirement> create a simple web server in Python with Flask
# ... autonomous mode runs and implements it ...
requirement> add a /health endpoint that returns JSON
# ... autonomous mode runs again with both requirements ...
Other Modes
# Single-shot mode (one task, then exit)
g3 "implement a function to calculate fibonacci numbers"
# Traditional autonomous mode (reads requirements.md)
g3 --autonomous
# Traditional chat mode (simple interactive chat without autonomous runs)
g3 --chat
Building from Source
# Build the project
cargo build --release
# Run from the build directory
./target/release/g3
# Or copy the binary to a directory in your PATH (on macOS, copy the dylib alongside it)
cp target/release/g3 ~/.local/bin/
cp target/release/libVisionBridge.dylib ~/.local/bin/ # macOS only
# Execute a task
g3 "implement a function to calculate fibonacci numbers"
WebDriver Browser Automation
G3 includes WebDriver support for browser automation tasks using Safari.
One-Time Setup (macOS only):
Safari Remote Automation must be enabled before using WebDriver tools. Run this once:
# Option 1: Use the provided script
./scripts/enable-safari-automation.sh
# Option 2: Enable manually
safaridriver --enable # Requires password
# Option 3: Enable via Safari UI
# Safari → Preferences → Advanced → Show Develop menu
# Then: Develop → Allow Remote Automation
For detailed setup instructions and troubleshooting, see WebDriver Setup Guide.
Usage: Run G3 with the --webdriver flag to enable browser automation tools.
macOS Accessibility API Tools
G3 includes support for controlling macOS applications via the Accessibility API, allowing you to automate native macOS apps.
Available Tools: macax_list_apps, macax_get_frontmost_app, macax_activate_app, macax_get_ui_tree, macax_find_elements, macax_click, macax_set_value, macax_get_value, macax_press_key
Setup: Enable with the --macax flag or in config with macax.enabled = true. Grant accessibility permissions:
- macOS: System Preferences → Security & Privacy → Privacy → Accessibility → Add your terminal app
For detailed documentation, see macOS Accessibility Tools Guide.
Note: This is particularly useful for testing and automating apps you're building with G3, as you can add accessibility identifiers to your UI elements.
Computer Control (Experimental)
G3 can interact with your computer's GUI for automation tasks:
Available Tools: mouse_click, type_text, find_element, take_screenshot, extract_text, find_text_on_screen, list_windows
Setup: Enable in config with computer_control.enabled = true and grant OS accessibility permissions:
- macOS: System Preferences → Security & Privacy → Accessibility
- Linux: Ensure X11 or Wayland access
- Windows: Run as administrator (first time only)
Session Logs
G3 automatically saves session logs for each interaction in the logs/ directory. These logs contain:
- Complete conversation history
- Token usage statistics
- Timestamps and session status
The logs/ directory is created automatically on first use and is excluded from version control.
License
MIT License - see LICENSE file for details
Contributing
G3 is an open-source project. Contributions are welcome! Please see CONTRIBUTING.md for guidelines.