# g3 Architecture

**Last updated**: February 2025

**Source of truth**: Crate structure in `crates/`, `Cargo.toml`, `DESIGN.md`, `skills/`

## Purpose

This document describes the internal architecture of g3, a modular AI coding agent built in Rust. It is intended for developers who want to understand, extend, or maintain the codebase.

## High-Level Overview

g3 follows a **tool-first philosophy**: instead of just providing advice, it actively uses tools to read files, write code, execute commands, and complete tasks autonomously.

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     g3-cli      │    │    g3-core      │    │  g3-providers   │
│                 │    │                 │    │                 │
│ • CLI parsing   │◄──►│ • Agent engine  │◄──►│ • Anthropic     │
│ • Interactive   │    │ • Context mgmt  │    │ • Databricks    │
│ • Streaming MD  │    │ • Tool system   │    │ • OpenAI        │
│ • Autonomous    │    │ • Streaming     │    │ • Embedded      │
│   mode          │    │ • Task exec     │    │   (llama.cpp)   │
│                 │    │ • TODO mgmt     │    │ • OAuth flow    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  g3-execution   │    │   g3-config     │    │   g3-planner    │
│                 │    │                 │    │                 │
│ • Code exec     │    │ • TOML config   │    │ • Requirements  │
│ • Shell cmds    │    │ • Env overrides │    │ • Git ops       │
│ • Streaming     │    │ • Provider      │    │ • Planning      │
│ • Error hdlg    │    │   settings      │    │   workflow      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                                             │
         │             ┌─────────────────┐             │
         │             │ g3-computer-    │             │
         └────────────►│   control       │◄────────────┘
                       │ • Mouse/kbd     │
                       │ • Screenshots   │
                       │ • OCR/Vision    │
                       │ • WebDriver     │
                       │ • macOS Ax API  │
                       └─────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
                       ┌─────────────────┐
                       │     studio      │
                       │ • Worktree mgmt │
                       │ • Session mgmt  │
                       └─────────────────┘
```

## Workspace Structure

g3 is organized as a Rust workspace with 8 crates:

```
g3/
├── src/main.rs               # Entry point (delegates to g3-cli)
├── crates/
│   ├── g3-cli/               # Command-line interface and TUI
│   ├── g3-core/              # Core agent engine and tools
│   ├── g3-providers/         # LLM provider abstractions
│   ├── g3-config/            # Configuration management
│   ├── g3-execution/         # Code execution engine
│   ├── g3-computer-control/  # Computer automation
│   ├── g3-planner/           # Planning mode workflow
│   └── studio/               # Multi-agent workspace manager
├── agents/                   # Agent persona definitions
├── skills/                   # Embedded skills (research, etc.)
├── logs/                     # Session logs (auto-created)
└── g3-plan/                  # Planning artifacts
```

## Crate Responsibilities

### g3-core (Central Hub)

**Location**: `crates/g3-core/`

**Purpose**: Core agent engine, tool system, and orchestration logic

Key modules:
- `lib.rs` - Main `Agent` struct and orchestration (~3400 lines)
- `context_window.rs` - Token tracking and context management
- `streaming_parser.rs` - Real-time LLM response parsing
- `tool_definitions.rs` - JSON schema definitions for all tools
- `tool_dispatch.rs` - Routes tool calls to implementations
- `tools/` - Tool implementations (file ops, shell, vision, webdriver, etc.)
- `error_handling.rs` - Error classification and recovery
- `retry.rs` - Retry logic with exponential backoff
- `prompts.rs` - System prompt generation
- `code_search/` - Tree-sitter based code search
- `skills/` - Agent Skills discovery, parsing, and extraction

**Key types**:
- `Agent` - Main agent struct, generic over UI output
- `ContextWindow` - Manages conversation history and token limits
- `StreamingToolParser` - Parses streaming LLM responses for tool calls
- `ToolCall` - Represents a tool invocation

### g3-providers (LLM Abstraction)

**Location**: `crates/g3-providers/`

**Purpose**: Unified interface for multiple LLM backends

Key modules:
- `lib.rs` - `LLMProvider` trait and `ProviderRegistry`
- `anthropic.rs` - Anthropic Claude API (~51k chars)
- `databricks.rs` - Databricks Foundation Models (~58k chars)
- `openai.rs` - OpenAI and compatible APIs (~18k chars)
- `embedded.rs` - Local models via llama.cpp (~34k chars)
- `oauth.rs` - OAuth authentication flow

**Key traits**:

```rust
#[async_trait]
pub trait LLMProvider: Send + Sync {
    async fn complete(&self, request: CompletionRequest) -> Result;
    async fn stream(&self, request: CompletionRequest) -> Result;
    fn name(&self) -> &str;
    fn model(&self) -> &str;
    fn has_native_tool_calling(&self) -> bool;
    fn supports_cache_control(&self) -> bool;
    fn max_tokens(&self) -> u32;
    fn temperature(&self) -> f32;
}
```

### g3-cli (User Interface)

**Location**: `crates/g3-cli/`

**Purpose**: Command-line interface, TUI, and execution modes

Key modules:
- `lib.rs` - Main CLI entry point and mode dispatch
- `interactive.rs` - Interactive REPL mode
- `autonomous.rs` - Autonomous coach-player mode
- `accumulative.rs` - Accumulative autonomous mode
- `agent_mode.rs` - Specialized agent execution
- `filter_json.rs` - JSON tool call filtering for display
- `ui_writer_impl.rs` - Console output implementation
- `streaming_markdown.rs` - Real-time markdown formatting

**Execution modes**:
1. **Single-shot**: `g3 "task description"` - Execute one task and exit
2. **Interactive**: `g3` - REPL-style conversation (default)
3. **Autonomous**: `g3 --autonomous` - Coach-player feedback loop
4. **Accumulative**: Default interactive mode with autonomous runs
5. **Planning**: `g3 --planning` - Requirements-driven development
6. **Agent Mode**: `g3 --agent ` - Run specialized agent personas

### g3-config (Configuration)

**Location**: `crates/g3-config/`

**Purpose**: TOML-based configuration management

Key structures:
- `Config` - Root configuration
- `ProvidersConfig` - Provider settings with named configs
- `AgentConfig` - Agent behavior settings
- `WebDriverConfig` - Browser automation settings
- `MacAxConfig` - macOS Accessibility API settings

**Configuration hierarchy** (highest priority last):
1. Default configuration
2. `~/.config/g3/config.toml`
3. `./g3.toml`
4. Environment variables (`G3_*`)
5. CLI arguments

### g3-execution (Code Execution)

**Location**: `crates/g3-execution/`

**Purpose**: Safe execution of shell commands and scripts

Features:
- Streaming output capture
- Exit code tracking
- Async execution via Tokio
- Error handling and formatting

### g3-computer-control (Automation)

**Location**: `crates/g3-computer-control/`

**Purpose**: Cross-platform computer control and automation

Key modules:
- `platform/` - Platform-specific implementations (macOS, Linux, Windows)
- `webdriver/` - Safari and Chrome WebDriver integration
- `ocr/` - Text extraction (Tesseract, Apple Vision)

**Platform support**:
- **macOS**: Core Graphics, Cocoa, screencapture, Vision framework
- **Linux**: X11/Xtest for input
- **Windows**: Win32 APIs

### g3-planner (Planning Mode)

**Location**: `crates/g3-planner/`

**Purpose**: Requirements-driven development workflow

Key modules:
- `planner.rs` - Main planning state machine (~40k chars)
- `state.rs` - Planning state management
- `git.rs` - Git operations
- `code_explore.rs` - Codebase exploration
- `llm.rs` - LLM interactions for planning
- `history.rs` - Planning history tracking

**Workflow**:
1. Write requirements in `/g3-plan/new_requirements.md`
2. The LLM refines the requirements
3. The requirements file is renamed to `current_requirements.md`
4. The coach/player loop implements them
5. Files are archived with timestamps
6. Changes are committed to Git with an LLM-generated message

### studio (Multi-Agent Workspace Manager)

**Location**: `crates/studio/`

**Purpose**: Manage multiple g3 agent sessions using git worktrees

Key modules:
- `main.rs` - CLI commands (run, exec, list, status, accept, discard)
- `git.rs` - Git worktree management
- `session.rs` - Session metadata and status tracking

Studio isolates agent sessions by giving each one its own git worktree, so multiple agents can work on the same codebase without conflicts.

### Skills System (Extensible Capabilities)

**Location**: `crates/g3-core/src/skills/` and `skills/`

**Purpose**: Portable skill packages that extend agent capabilities

g3 implements the [Agent Skills](https://agentskills.io) specification, allowing skills to be discovered from multiple locations and embedded into the binary for portability.

Key modules in `crates/g3-core/src/skills/`:
- `mod.rs` - Module exports and public API
- `parser.rs` - SKILL.md frontmatter and body parsing
- `discovery.rs` - Multi-location skill discovery with priority ordering
- `embedded.rs` - Skills compiled into the binary via `include_str!`
- `extraction.rs` - Script extraction to `.g3/bin/` with version tracking
- `prompt.rs` - Generates the skills XML for the system prompt

**Discovery Priority** (lowest to highest):
1. Embedded skills (compiled into binary)
2. Global: `~/.g3/skills/`
3. Extra paths from config
4. Workspace: `.g3/skills/`
5. Repo: `skills/` (highest priority, checked into git)

**Embedded Skills**: Core skills are embedded at compile time using `include_str!`, ensuring g3 works anywhere without external files:

```rust
// Currently empty - skills can be added here as needed
static EMBEDDED_SKILLS: &[EmbeddedSkill] = &[
    // Example:
    // EmbeddedSkill {
    //     name: "example-skill",
    //     skill_md: include_str!("../../../../skills/example-skill/SKILL.md"),
    // },
];
```

**Script Extraction**: Embedded scripts are extracted to `.g3/bin/` on first use:
- Version tracking via content hash in `.g3/bin/