g3 Architecture
Last updated: January 2025
Source of truth: Crate structure in crates/, Cargo.toml, DESIGN.md
Purpose
This document describes the internal architecture of g3, a modular AI coding agent built in Rust. It is intended for developers who want to understand, extend, or maintain the codebase.
High-Level Overview
g3 follows a tool-first philosophy: instead of just providing advice, it actively uses tools to read files, write code, execute commands, and complete tasks autonomously.
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ g3-cli          │    │ g3-core         │    │ g3-providers    │
│                 │    │                 │    │                 │
│ • CLI parsing   │◄──►│ • Agent engine  │◄──►│ • Anthropic     │
│ • Interactive   │    │ • Context mgmt  │    │ • Databricks    │
│ • Streaming MD  │    │ • Tool system   │    │ • OpenAI        │
│ • Autonomous    │    │ • Streaming     │    │ • Embedded      │
│   mode          │    │ • Task exec     │    │   (llama.cpp)   │
│                 │    │ • TODO mgmt     │    │ • OAuth flow    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ g3-execution    │    │ g3-config       │    │ g3-planner      │
│                 │    │                 │    │                 │
│ • Code exec     │    │ • TOML config   │    │ • Requirements  │
│ • Shell cmds    │    │ • Env overrides │    │ • Git ops       │
│ • Streaming     │    │ • Provider      │    │ • Planning      │
│ • Error hdlg    │    │   settings      │    │   workflow      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         │             ┌─────────────────┐             │
         │             │ g3-computer-    │             │
         └────────────►│ control         │◄────────────┘
                       │ • Mouse/kbd     │
                       │ • Screenshots   │
                       │ • OCR/Vision    │
                       │ • WebDriver     │
                       │ • macOS Ax API  │
                       └─────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
┌─────────────────┐    ┌─────────────────┐
│ g3-ensembles    │    │ studio          │
│ • Flock mode    │    │                 │
│ • Multi-agent   │    │ • Worktree mgmt │
│ • Parallel dev  │    │ • Session mgmt  │
└─────────────────┘    └─────────────────┘
Workspace Structure
g3 is organized as a Rust workspace with 9 crates:
g3/
├── src/main.rs # Entry point (delegates to g3-cli)
├── crates/
│ ├── g3-cli/ # Command-line interface and TUI
│ ├── g3-core/ # Core agent engine and tools
│ ├── g3-providers/ # LLM provider abstractions
│ ├── g3-config/ # Configuration management
│ ├── g3-execution/ # Code execution engine
│ ├── g3-computer-control/ # Computer automation
│ ├── g3-planner/ # Planning mode workflow
│ ├── g3-ensembles/ # Multi-agent (flock) mode
│ └── studio/ # Multi-agent workspace manager
├── agents/ # Agent persona definitions
├── logs/ # Session logs (auto-created)
└── g3-plan/ # Planning artifacts
Crate Responsibilities
g3-core (Central Hub)
Location: crates/g3-core/
Purpose: Core agent engine, tool system, and orchestration logic
Key modules:
- lib.rs - Main Agent struct and orchestration (~3400 lines)
- context_window.rs - Token tracking and context management
- streaming_parser.rs - Real-time LLM response parsing
- tool_definitions.rs - JSON schema definitions for all tools
- tool_dispatch.rs - Routes tool calls to implementations
- tools/ - Tool implementations (file ops, shell, vision, webdriver, etc.)
- error_handling.rs - Error classification and recovery
- retry.rs - Retry logic with exponential backoff
- prompts.rs - System prompt generation
- code_search/ - Tree-sitter based code search
Key types:
- Agent<W: UiWriter> - Main agent struct, generic over UI output
- ContextWindow - Manages conversation history and token limits
- StreamingToolParser - Parses streaming LLM responses for tool calls
- ToolCall - Represents a tool invocation
g3-providers (LLM Abstraction)
Location: crates/g3-providers/
Purpose: Unified interface for multiple LLM backends
Key modules:
- lib.rs - LLMProvider trait and ProviderRegistry
- anthropic.rs - Anthropic Claude API (~51k chars)
- databricks.rs - Databricks Foundation Models (~58k chars)
- openai.rs - OpenAI and compatible APIs (~18k chars)
- embedded.rs - Local models via llama.cpp (~34k chars)
- oauth.rs - OAuth authentication flow
Key traits:
#[async_trait]
pub trait LLMProvider: Send + Sync {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse>;
    async fn stream(&self, request: CompletionRequest) -> Result<CompletionStream>;
    fn name(&self) -> &str;
    fn model(&self) -> &str;
    fn has_native_tool_calling(&self) -> bool;
    fn supports_cache_control(&self) -> bool;
    fn max_tokens(&self) -> u32;
    fn temperature(&self) -> f32;
}
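To illustrate how a registry can sit on top of a trait like this, here is a minimal sketch. It is not the real ProviderRegistry API, which this document does not show. The `Provider` trait is a simplified, synchronous stand-in for LLMProvider, and `EchoProvider`, `Registry`, `register`, and `get` are hypothetical names invented for the example.

```rust
use std::collections::HashMap;

/// Simplified, synchronous stand-in for LLMProvider (hypothetical; the real trait is async).
pub trait Provider: Send + Sync {
    fn name(&self) -> &str;
    fn model(&self) -> &str;
    fn has_native_tool_calling(&self) -> bool;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Toy provider that echoes the prompt back; exists only for illustration.
pub struct EchoProvider {
    pub model: String,
}

impl Provider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }
    fn model(&self) -> &str {
        &self.model
    }
    fn has_native_tool_calling(&self) -> bool {
        false
    }
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("[{}] {}", self.model, prompt))
    }
}

/// Hypothetical registry: providers keyed by name; the first one registered is the default.
#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn Provider>>,
    default_name: Option<String>,
}

impl Registry {
    pub fn register(&mut self, provider: Box<dyn Provider>) {
        let name = provider.name().to_string();
        if self.default_name.is_none() {
            self.default_name = Some(name.clone());
        }
        self.providers.insert(name, provider);
    }

    /// Look up a provider by name, or fall back to the default when `name` is `None`.
    pub fn get(&self, name: Option<&str>) -> Option<&dyn Provider> {
        let key = name.or(self.default_name.as_deref())?;
        self.providers.get(key).map(|p| p.as_ref())
    }
}
```

Because callers only ever see `&dyn Provider`, the agent code does not need to know which backend answers a request. That is the point of the trait-plus-registry split.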
g3-cli (User Interface)
Location: crates/g3-cli/
Purpose: Command-line interface, TUI, and execution modes
Key modules:
- lib.rs - Main CLI entry point and mode dispatch
- interactive.rs - Interactive REPL mode
- autonomous.rs - Autonomous coach-player mode
- accumulative.rs - Accumulative autonomous mode
- agent_mode.rs - Specialized agent execution
- filter_json.rs - JSON tool call filtering for display
- ui_writer_impl.rs - Console output implementation
- streaming_markdown.rs - Real-time markdown formatting
Execution modes:
- Single-shot: g3 "task description" - Execute one task and exit
- Interactive: g3 - REPL-style conversation (default)
- Autonomous: g3 --autonomous - Coach-player feedback loop
- Accumulative: Default interactive mode with autonomous runs
- Planning: g3 --planning - Requirements-driven development
- Agent Mode: g3 --agent <name> - Run specialized agent personas
g3-config (Configuration)
Location: crates/g3-config/
Purpose: TOML-based configuration management
Key structures:
- Config - Root configuration
- ProvidersConfig - Provider settings with named configs
- AgentConfig - Agent behavior settings
- WebDriverConfig - Browser automation settings
- MacAxConfig - macOS Accessibility API settings
Configuration hierarchy (highest priority last):
- Default configuration
- ~/.config/g3/config.toml
- ./g3.toml
- Environment variables (G3_*)
- CLI arguments
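The precedence rule above amounts to folding layers from lowest to highest priority, with each later layer overriding only the values it actually sets. A minimal sketch follows. The `Layer` struct, its three fields, and the `overlay` and `resolve` functions are hypothetical names for this example. The real Config has more fields, and its loading code may work differently.

```rust
/// Hypothetical subset of settings; stands in for one configuration source.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Layer {
    pub provider: Option<String>,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
}

impl Layer {
    /// Overlay `higher` on top of `self`: any value set in `higher` wins.
    pub fn overlay(self, higher: Layer) -> Layer {
        Layer {
            provider: higher.provider.or(self.provider),
            max_tokens: higher.max_tokens.or(self.max_tokens),
            temperature: higher.temperature.or(self.temperature),
        }
    }
}

/// Fold layers in priority order (lowest first, highest last).
pub fn resolve(layers: Vec<Layer>) -> Layer {
    layers.into_iter().fold(Layer::default(), Layer::overlay)
}
```

Calling `resolve` with layers for defaults, the user file, the project file, environment variables, and CLI arguments (in that order) yields a value where, for each field, the last layer that set it wins.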
g3-execution (Code Execution)
Location: crates/g3-execution/
Purpose: Safe execution of shell commands and scripts
Features:
- Streaming output capture
- Exit code tracking
- Async execution via Tokio
- Error handling and formatting
g3-computer-control (Automation)
Location: crates/g3-computer-control/
Purpose: Cross-platform computer control and automation
Key modules:
- platform/ - Platform-specific implementations (macOS, Linux, Windows)
- webdriver/ - Safari and Chrome WebDriver integration
- ocr/ - Text extraction (Tesseract, Apple Vision)
Platform support:
- macOS: Core Graphics, Cocoa, screencapture, Vision framework
- Linux: X11/Xtest for input
- Windows: Win32 APIs
g3-planner (Planning Mode)
Location: crates/g3-planner/
Purpose: Requirements-driven development workflow
Key modules:
- planner.rs - Main planning state machine (~40k chars)
- state.rs - Planning state management
- git.rs - Git operations
- code_explore.rs - Codebase exploration
- llm.rs - LLM interactions for planning
- history.rs - Planning history tracking
Workflow:
- Write requirements in <codepath>/g3-plan/new_requirements.md
- LLM refines requirements
- Requirements renamed to current_requirements.md
- Coach/player loop implements
- Files archived with timestamps
- Git commit with LLM-generated message
g3-ensembles (Multi-Agent)
Location: crates/g3-ensembles/
Purpose: Parallel multi-agent development (Flock mode)
Key modules:
- flock.rs - Flock orchestration (~43k chars)
- status.rs - Agent status tracking
Flock mode enables parallel development by spawning multiple agent instances working on different parts of a project.
studio (Multi-Agent Workspace Manager)
Location: crates/studio/
Purpose: Manage multiple g3 agent sessions using git worktrees
Key modules:
- main.rs - CLI commands (run, exec, list, status, accept, discard)
- git.rs - Git worktree management
- session.rs - Session metadata and status tracking
Studio enables isolated agent sessions by creating git worktrees, allowing multiple agents to work on the same codebase without conflicts.
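As a rough illustration of the worktree approach, the sketch below builds (but does not run) a `git worktree add -b <branch> <path>` command for a session. The function name, the `session/<id>` branch naming, and the `.worktrees/` location are assumptions made for the example. The layout studio actually uses is not described here.

```rust
use std::path::Path;
use std::process::Command;

/// Build the git command that would create an isolated worktree for one session.
/// Branch name and worktree location are hypothetical conventions.
pub fn worktree_add_command(repo: &Path, session_id: &str) -> Command {
    let branch = format!("session/{session_id}");
    let path = repo.join(".worktrees").join(session_id);

    let mut cmd = Command::new("git");
    cmd.current_dir(repo)
        .arg("worktree")
        .arg("add")
        .arg("-b")
        .arg(&branch) // new branch, so each session commits independently
        .arg(&path); // separate checkout directory for this session
    cmd
}
```

Each session then has its own checkout and branch, so agents can edit files concurrently without touching each other's working directory.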
Data Flow
Request Flow
  User Input
       │
       ▼
┌─────────────┐
│ g3-cli      │  Parse input, determine mode
└─────────────┘
       │
       ▼
┌─────────────┐
│ g3-core     │  Add to context window
│ Agent       │  Build completion request
└─────────────┘
       │
       ▼
┌─────────────┐
│ g3-providers│  Send to LLM provider
│ Registry    │  Stream response
└─────────────┘
       │
       ▼
┌─────────────┐
│ g3-core     │  Parse streaming response
│ Parser      │  Detect tool calls
└─────────────┘
       │
       ▼
┌─────────────┐
│ g3-core     │  Execute tools
│ Tools       │  Return results
└─────────────┘
       │
       ▼
┌─────────────┐
│ g3-core     │  Add results to context
│ Agent       │  Continue or complete
└─────────────┘
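The flow above is essentially a loop: call the model, run any requested tool, append the result to the context, and repeat until the model produces a final answer. The sketch below shows that loop shape only. `Message`, `Turn`, and `run_agent` are hypothetical names, the model and tool runner are passed in as closures, and streaming is omitted for brevity.

```rust
/// Hypothetical conversation entries kept in the context.
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
    ToolResult { tool: String, output: String },
}

/// Hypothetical outcome of one model call.
pub enum Turn {
    /// The model asked for a tool to be run.
    ToolCall { tool: String, input: String },
    /// The model produced a final answer.
    Final(String),
}

/// Run the request loop until the model finishes or `max_iterations` is reached.
pub fn run_agent(
    user_input: &str,
    mut model: impl FnMut(&[Message]) -> Turn,
    mut run_tool: impl FnMut(&str, &str) -> String,
    max_iterations: usize,
) -> Vec<Message> {
    // Add the user input to the context.
    let mut context = vec![Message::User(user_input.to_string())];
    for _ in 0..max_iterations {
        match model(&context) {
            // Tool call detected: execute it and feed the result back.
            Turn::ToolCall { tool, input } => {
                let output = run_tool(&tool, &input);
                context.push(Message::ToolResult { tool, output });
            }
            // No tool call: the task is complete.
            Turn::Final(text) => {
                context.push(Message::Assistant(text));
                break;
            }
        }
    }
    context
}
```

The iteration cap is an addition for the sketch, so that a model that keeps requesting tools cannot loop forever.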
Context Window Management
The ContextWindow struct manages conversation history and tracks token usage against the provider's context limit:
- Token Tracking: Monitors usage as percentage of provider's context limit
- Context Thinning: At 50%, 60%, 70%, 80% thresholds, replaces large tool results with file references
- Auto-Compaction: At 80% capacity, triggers conversation compaction
- Provider Adaptation: Adjusts to different model context windows (4k to 200k+ tokens)
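One way to read the thresholds above is as a small decision function over the usage percentage. The sketch below is an interpretation, not the actual ContextWindow logic. In particular, it assumes that each thinning threshold fires once, and that compaction is only tried at 80% after the 80% thinning pass has already run. `ContextAction`, `usage_pct`, and `next_action` are hypothetical names.

```rust
/// Hypothetical action the context manager would take next.
#[derive(Debug, PartialEq)]
pub enum ContextAction {
    Continue,
    Thin { threshold_pct: u8 },
    Compact,
}

/// Thinning thresholds, in percent of the provider's context limit.
const THIN_THRESHOLDS: [u8; 4] = [50, 60, 70, 80];
/// Usage at which compaction is triggered.
const COMPACT_AT: u8 = 80;

/// Usage as a whole percentage, clamped to 100.
pub fn usage_pct(used_tokens: u32, limit_tokens: u32) -> u8 {
    if limit_tokens == 0 {
        return 100;
    }
    let pct = (used_tokens as u64 * 100) / limit_tokens as u64;
    pct.min(100) as u8
}

/// Decide what to do, given usage and the highest threshold already thinned at.
pub fn next_action(used: u32, limit: u32, last_thinned: Option<u8>) -> ContextAction {
    let pct = usage_pct(used, limit);
    // Highest crossed threshold that has not been handled yet.
    let crossed = THIN_THRESHOLDS
        .iter()
        .copied()
        .filter(|&t| pct >= t && last_thinned.map_or(true, |done| t > done))
        .max();
    match crossed {
        Some(t) => ContextAction::Thin { threshold_pct: t },
        None if pct >= COMPACT_AT => ContextAction::Compact,
        None => ContextAction::Continue,
    }
}
```

Working in percentages rather than absolute token counts is what lets the same rules apply to a 4k-token model and a 200k-token model.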
Error Handling
g3 classifies errors and retries the recoverable ones automatically:
- Error Classification: Distinguishes recoverable vs non-recoverable errors
- Automatic Retry: Exponential backoff with jitter for:
  - Rate limits (HTTP 429)
  - Network errors
  - Server errors (HTTP 5xx)
  - Timeouts
- Error Logging: Detailed logs saved to logs/errors/
- Graceful Degradation: Continues when possible, fails gracefully when not
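The classification and retry behavior can be sketched as follows. This is an illustrative version, not the code in error_handling.rs or retry.rs. The `ErrorKind` variants, the 500 ms base delay, the 30 s cap, and the jitter range of 50% to 100% of the computed delay are all assumptions chosen for the example. The jitter value and the sleep function are injected so the sketch needs no external crates.

```rust
use std::time::Duration;

/// Hypothetical error categories for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ErrorKind {
    RateLimited, // HTTP 429
    Server(u16), // HTTP 5xx
    Network,
    Timeout,
    Other, // anything else, e.g. a bad request
}

pub fn classify_status(status: u16) -> ErrorKind {
    match status {
        429 => ErrorKind::RateLimited,
        500..=599 => ErrorKind::Server(status),
        _ => ErrorKind::Other,
    }
}

pub fn is_recoverable(kind: ErrorKind) -> bool {
    !matches!(kind, ErrorKind::Other)
}

/// Delay before retrying after 0-based `attempt`: base * 2^attempt, capped,
/// then scaled into [50%, 100%] by `jitter`, which is clamped to [0.0, 1.0].
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration, jitter: f64) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt)).min(cap);
    exp.mul_f64(0.5 + 0.5 * jitter.clamp(0.0, 1.0))
}

/// Retry `op` while errors are recoverable, up to `max_attempts` calls in total.
pub fn retry<T>(
    max_attempts: u32,
    mut op: impl FnMut(u32) -> Result<T, ErrorKind>,
    mut jitter: impl FnMut() -> f64,
    mut sleep: impl FnMut(Duration),
) -> Result<T, ErrorKind> {
    let base = Duration::from_millis(500); // assumed base delay
    let cap = Duration::from_secs(30); // assumed maximum delay
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(kind) if is_recoverable(kind) && attempt + 1 < max_attempts => {
                sleep(backoff_delay(attempt, base, cap, jitter()));
                attempt += 1;
            }
            Err(kind) => return Err(kind),
        }
    }
}
```

In real use, `jitter` would draw from a random source and `sleep` would await a timer. Jitter spreads retries from concurrent clients so they do not all hit a rate-limited endpoint at the same instant.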
Session Management
Sessions are tracked in .g3/sessions/<session_id>/:
- session.json - Full conversation history and metadata
- todo.g3.md - Session-scoped TODO list
- Context summaries and thinned content
Legacy logs are stored in logs/g3_session_*.json.
Extension Points
Adding a New Tool
- Add tool definition in g3-core/src/tool_definitions.rs
- Implement handler in g3-core/src/tools/
- Add dispatch case in g3-core/src/tool_dispatch.rs
- Update system prompt if needed in g3-core/src/prompts.rs
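The first three steps fit together roughly as in the sketch below: a definition that names the tool and its arguments, a handler that does the work, and a dispatch arm that connects the two. The shapes are illustrative. `ToolDef`, this `ToolCall` layout, and the `word_count` tool are invented for the example, and the real definitions are JSON schemas.

```rust
use std::collections::HashMap;

/// Tool definition (hypothetical shape; real definitions are JSON schemas).
pub struct ToolDef {
    pub name: &'static str,
    pub description: &'static str,
    pub required_args: &'static [&'static str],
}

pub const WORD_COUNT: ToolDef = ToolDef {
    name: "word_count",
    description: "Count whitespace-separated words in `text`",
    required_args: &["text"],
};

/// Parsed tool invocation (hypothetical shape).
pub struct ToolCall {
    pub name: String,
    pub args: HashMap<String, String>,
}

/// Handler for the invented `word_count` tool.
fn word_count(args: &HashMap<String, String>) -> Result<String, String> {
    let text = args.get("text").ok_or("missing required argument: text")?;
    Ok(text.split_whitespace().count().to_string())
}

/// Dispatch: route a tool name to its handler.
pub fn dispatch(call: &ToolCall) -> Result<String, String> {
    match call.name.as_str() {
        "word_count" => word_count(&call.args),
        other => Err(format!("unknown tool: {}", other)),
    }
}
```

Returning an error string for unknown tools, rather than panicking, means the failure can be reported back to the model as an ordinary tool result.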
Adding a New Provider
- Implement LLMProvider trait in g3-providers/src/
- Add configuration struct in g3-config/src/lib.rs
- Register provider in g3-core/src/lib.rs (in new_with_mode_and_readme)
- Update documentation
Adding a New Execution Mode
- Add CLI arguments in g3-cli/src/lib.rs
- Implement mode logic in the CLI
- May require new agent methods in g3-core
Key Files for Understanding
Start reading here:
- src/main.rs - Entry point (trivial, delegates to g3-cli)
- crates/g3-cli/src/lib.rs - CLI and execution modes
- crates/g3-core/src/lib.rs - Agent implementation
- crates/g3-providers/src/lib.rs - Provider trait and registry
- crates/g3-core/src/tool_definitions.rs - Available tools
- crates/g3-config/src/lib.rs - Configuration structures
- DESIGN.md - Original design document
Dependencies
Key external dependencies:
- tokio: Async runtime
- reqwest: HTTP client for API calls
- serde/serde_json: Serialization
- clap: CLI argument parsing
- tree-sitter: Syntax-aware code search
- llama_cpp: Local model inference (with Metal acceleration)
- fantoccini: WebDriver client