g3/AGENTS.md
Dhanji R. Prasanna d695f10604 Document dependency analysis artifacts in AGENTS.md
Added section explaining the analysis/deps/ directory contents:
- graph.json: Raw dependency graph data
- graph.summary.md: Overview metrics and rankings
- sccs.md: Cycle detection results
- layers.observed.md: Layer diagrams
- hotspots.md: Coupling hotspots
- limitations.md: Analysis limitations

Includes key findings from the Euler agent's static analysis.
2026-01-06 12:31:17 +11:00

# AGENTS.md - Machine Instructions for G3
**Last updated**: January 2025
**Purpose**: Enable AI agents to work safely and effectively with this codebase
## System Overview
G3 is an AI coding agent built in Rust. It uses LLM providers to execute tasks through a tool-based interface. The codebase is organized as a Cargo workspace with 9 crates.
### Quick Reference
| Crate | Purpose | Stability |
|-------|---------|----------|
| `g3-core` | Agent engine, tools, context management | Stable |
| `g3-providers` | LLM provider abstractions | Stable |
| `g3-cli` | Command-line interface | Stable |
| `g3-config` | Configuration management | Stable |
| `g3-execution` | Code execution | Stable |
| `g3-computer-control` | Computer automation | Experimental |
| `g3-planner` | Planning mode | Stable |
| `g3-ensembles` | Multi-agent (flock) mode | Experimental |
| `g3-console` | Web monitoring console | Experimental |
## Critical Invariants
### MUST Hold
1. **Tool calls must be valid JSON** - The streaming parser expects well-formed tool calls
2. **Context window limits must be respected** - Exceeding limits causes API errors
3. **Provider trait implementations must be Send + Sync** - Required for async runtime
4. **Session IDs must be unique** - Used for log file paths and TODO scoping
5. **File paths in tools support tilde expansion** - `~` expands to home directory
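
Invariant 5 can be illustrated with a minimal sketch. `expand_tilde` is a hypothetical helper, not a function taken from the G3 codebase, and the home directory is passed in explicitly so the example stays dependency-free. It assumes only a bare `~` or a leading `~/` is expanded.

```rust
use std::path::{Path, PathBuf};

/// Expand a leading `~` to `home`.
/// ASSUMPTION: only a bare `~` or a `~/` prefix is expanded; `~user` forms
/// and a `~` in the middle of a path are returned unchanged.
fn expand_tilde(path: &str, home: &Path) -> PathBuf {
    if path == "~" {
        return home.to_path_buf();
    }
    match path.strip_prefix("~/") {
        // Join the remainder onto the home directory.
        Some(rest) => home.join(rest),
        // No tilde prefix: keep the path as given.
        None => PathBuf::from(path),
    }
}
```

A caller would typically resolve `home` once (for example from the `HOME` environment variable on Unix) and pass it to every tool that accepts a path.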
### MUST NOT Do
1. **Never block the async runtime** - Offload CPU-intensive or blocking work with `tokio::task::spawn_blocking`; plain `tokio::spawn` still runs on the async worker threads
2. **Never store secrets in logs** - API keys are redacted in error logs
3. **Never modify files outside working directory without explicit permission**
4. **Never assume tool results fit in context** - Large results are thinned automatically
## Recommended Entry Points
### For Understanding the System
1. `src/main.rs` - Entry point (trivial)
2. `crates/g3-cli/src/lib.rs` - CLI logic and execution modes
3. `crates/g3-core/src/lib.rs` - Agent struct and orchestration
4. `crates/g3-providers/src/lib.rs` - Provider trait definition
### For Adding Features
1. **New tool**: `crates/g3-core/src/tool_definitions.rs` → `crates/g3-core/src/tools/`
2. **New provider**: `crates/g3-providers/src/` → implement `LLMProvider` trait
3. **New CLI mode**: `crates/g3-cli/src/lib.rs`
4. **New config option**: `crates/g3-config/src/lib.rs`
### For Debugging
1. Session logs: `.g3/sessions/<session_id>/session.json`
2. Error logs: `logs/errors/`
3. Context state: Use `/stats` command in interactive mode
## Dangerous/Subtle Code Paths
### Context Window Management (`g3-core/src/context_window.rs`)
- **Thinning**: Automatically replaces large tool results with file references
- **Summarization**: Compresses conversation history at 80% capacity
- **Token estimation**: Uses character-based heuristics, not exact tokenization
- **Risk**: Incorrect token estimates can cause context overflow
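
The token estimation and the 80% summarization trigger can be sketched as follows. The function names and the 4-characters-per-token ratio are assumptions for illustration; the actual heuristic in `context_window.rs` may differ.

```rust
/// Rough token estimate from character count.
/// ASSUMPTION: about 4 characters per token; not the real G3 heuristic.
fn estimate_tokens(text: &str) -> usize {
    const CHARS_PER_TOKEN: usize = 4;
    // Round up so a short non-empty string still counts as one token.
    (text.chars().count() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
}

/// True once estimated usage reaches 80% of the context window.
fn should_summarize(used_tokens: usize, window_tokens: usize) -> bool {
    // Integer arithmetic avoids floating-point rounding at the boundary.
    used_tokens * 10 >= window_tokens * 8
}
```

Because the estimate is a heuristic, it can undercount for code or non-English text. Triggering below the hard limit, as the 80% threshold does, leaves headroom for that error.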
### Streaming Parser (`g3-core/src/streaming_parser.rs`)
- Parses LLM responses in real-time for tool calls
- Must handle partial JSON across chunk boundaries
- **Risk**: Malformed responses can cause parsing failures
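
One way to handle partial JSON across chunk boundaries is to buffer input until a top-level object closes. The sketch below is a hypothetical accumulator, not the parser in `streaming_parser.rs`. It tracks brace depth and string state only, and assumes tool calls arrive as `{...}` objects embedded in other text.

```rust
/// Hypothetical accumulator: buffers streamed text and yields each complete
/// top-level `{...}` object once its closing brace arrives.
#[derive(Default)]
struct JsonObjectBuffer {
    buf: String,
    depth: usize,
    in_string: bool,
    escaped: bool,
}

impl JsonObjectBuffer {
    /// Feed one chunk; returns any objects completed by this chunk.
    fn push(&mut self, chunk: &str) -> Vec<String> {
        let mut complete = Vec::new();
        for ch in chunk.chars() {
            // Skip text outside any object (e.g. prose between tool calls).
            if self.depth == 0 && ch != '{' {
                continue;
            }
            self.buf.push(ch);
            if self.in_string {
                // Inside a string, braces are data; track escapes and the closing quote.
                if self.escaped {
                    self.escaped = false;
                } else if ch == '\\' {
                    self.escaped = true;
                } else if ch == '"' {
                    self.in_string = false;
                }
                continue;
            }
            match ch {
                '"' => self.in_string = true,
                '{' => self.depth += 1,
                '}' => {
                    self.depth -= 1;
                    if self.depth == 0 {
                        // Hand the finished object out and reset the buffer.
                        complete.push(std::mem::take(&mut self.buf));
                    }
                }
                _ => {}
            }
        }
        complete
    }
}
```

The accumulator only finds object boundaries; it does not validate the JSON. Each returned string should still go through a real JSON parser, with parse errors reported rather than dropped.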
### Tool Dispatch (`g3-core/src/tool_dispatch.rs`)
- Routes tool calls to implementations
- Handles both native and JSON-based tool calling
- **Risk**: Missing dispatch cases cause silent failures
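
The silent-failure risk can be reduced by making the fallback arm return an explicit error. This is a minimal sketch, not the code in `tool_dispatch.rs`: `DispatchError`, the `dispatch` signature, the `read_file` tool name, and the placeholder return values are all assumptions for illustration.

```rust
/// Hypothetical error type for dispatch failures.
#[derive(Debug, PartialEq)]
enum DispatchError {
    UnknownTool(String),
}

/// Route a tool call by name. Return values are placeholders standing in
/// for real tool implementations.
fn dispatch(tool: &str, args: &str) -> Result<String, DispatchError> {
    match tool {
        "read_file" => Ok(format!("read_file({args})")),
        "background_process" => Ok(format!("background_process({args})")),
        // Catch-all arm: surface the unknown name instead of silently doing nothing.
        other => Err(DispatchError::UnknownTool(other.to_string())),
    }
}
```

With this shape, a tool that was added to the definitions but not to the dispatcher produces a visible `UnknownTool` error the first time it is called.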
### Retry Logic (`g3-core/src/retry.rs`)
- Exponential backoff with jitter
- Different configs for interactive vs autonomous mode
- **Risk**: Aggressive retries can hit rate limits harder
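
Exponential backoff with jitter can be sketched as below. The function name, parameters, and the specific jitter scheme (scaling the capped delay into the range 50%-100%) are assumptions; `retry.rs` may use different constants or a different scheme. The random value is passed in by the caller so the sketch needs no external crate.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`, then scaled by a factor in [0.5, 1.0].
/// `jitter` is a caller-supplied random value in [0.0, 1.0].
fn backoff_delay(attempt: u32, base: Duration, max: Duration, jitter: f64) -> Duration {
    // Saturating operations keep large attempt counts from overflowing.
    let factor = 2u32.saturating_pow(attempt);
    let capped = base.saturating_mul(factor).min(max);
    // Spread retries out so concurrent clients do not retry in lockstep.
    capped.mul_f64(0.5 + 0.5 * jitter.clamp(0.0, 1.0))
}
```

Interactive and autonomous modes could then differ only in the `base`, `max`, and retry-count values they pass in. A larger `max` in autonomous mode trades latency for fewer rate-limit hits.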
## Performance Constraints
1. **Streaming is preferred** - Non-streaming requests block UI
2. **Tool results are size-limited** - Large outputs are truncated or thinned
3. **Concurrent tool calls** - Enabled by `allow_multiple_tool_calls` config
4. **Background processes** - Long-running commands use `background_process` tool
## Testing Strategy
### Test Locations
- Unit tests: `crates/*/tests/`
- Integration tests: `crates/*/tests/`
- Test fixtures: `examples/test_code/`
### Running Tests
```bash
# All tests
cargo test
# Specific crate
cargo test -p g3-core
# With output
cargo test -- --nocapture
```
### Test Considerations
- Provider tests may require API keys
- Computer control tests require OS permissions
- WebDriver tests require browser setup
## Do's and Don'ts for Automated Changes
### Do
- ✅ Run `cargo check` after modifications
- ✅ Run `cargo test` before committing
- ✅ Update tool definitions when adding tools
- ✅ Add tests for new functionality
- ✅ Use existing patterns for similar features
- ✅ Keep functions under 80 lines
- ✅ Update documentation for user-facing changes
### Don't
- ❌ Modify `Cargo.toml` dependencies without justification
- ❌ Add blocking code in async contexts
- ❌ Store sensitive data in plain text
- ❌ Ignore error handling
- ❌ Create deeply nested conditionals (>6 levels)
- ❌ Add external dependencies for simple tasks
## Common Incorrect Assumptions
1. **"All providers support tool calling"** - Embedded models use JSON fallback
2. **"Context window is unlimited"** - Each provider has limits (4k-200k tokens)
3. **"Tool results are always small"** - File reads can return megabytes
4. **"Sessions persist across runs"** - Sessions are ephemeral by default
5. **"All platforms are equal"** - macOS has more features (Vision, Accessibility)
## Architecture Decisions
See `DESIGN.md` for original design rationale.
Key decisions:
- **Rust for performance and safety** - Async runtime, memory safety
- **Workspace structure** - Separation of concerns, independent compilation
- **Provider abstraction** - Swap providers without code changes
- **Tool-first philosophy** - Agent acts through tools, not just advice
- **Session-scoped state** - TODO lists, logs tied to sessions
## File Structure Quick Reference
```
g3/
├── src/main.rs # Entry point
├── crates/
│ ├── g3-cli/src/
│ │ ├── lib.rs # CLI logic (~112k chars)
│ │ └── retro_tui.rs # Retro TUI mode
│ ├── g3-core/src/
│ │ ├── lib.rs # Agent struct (~3400 lines)
│ │ ├── context_window.rs # Context management
│ │ ├── tool_definitions.rs # Tool schemas
│ │ ├── tool_dispatch.rs # Tool routing
│ │ ├── tools/ # Tool implementations
│ │ ├── streaming_parser.rs # Response parsing
│ │ └── retry.rs # Retry logic
│ ├── g3-providers/src/
│ │ ├── lib.rs # Provider trait
│ │ ├── anthropic.rs # Anthropic Claude
│ │ ├── databricks.rs # Databricks
│ │ ├── openai.rs # OpenAI
│ │ └── embedded.rs # Local models
│ ├── g3-config/src/lib.rs # Configuration
│ ├── g3-planner/src/ # Planning mode
│ ├── g3-ensembles/src/ # Flock mode
│ └── g3-computer-control/src/ # Automation
├── agents/ # Agent personas
├── docs/ # Documentation
└── logs/ # Session logs
```
## Pointers to Documentation
- [Architecture](docs/architecture.md) - System design and data flow
- [Configuration](docs/configuration.md) - Config file format and options
- [Tools Reference](docs/tools.md) - All available tools
- [Providers Guide](docs/providers.md) - LLM provider setup
- [Control Commands](docs/CONTROL_COMMANDS.md) - Interactive commands
- [Code Search](docs/CODE_SEARCH.md) - Tree-sitter search guide
- [Flock Mode](docs/FLOCK_MODE.md) - Multi-agent development
## Dependency Analysis Artifacts
The `analysis/deps/` directory contains static analysis artifacts generated by the Euler agent:
| File | Purpose |
|------|--------|
| `graph.json` | Raw dependency graph data (crate and file-level edges with evidence) |
| `graph.summary.md` | Overview metrics: crate counts, edge counts, fan-in/fan-out rankings |
| `sccs.md` | Strongly Connected Components analysis (cycle detection via Tarjan's algorithm) |
| `layers.observed.md` | Mechanically-derived layer diagram showing crate hierarchy and intra-crate module structure |
| `hotspots.md` | Coupling hotspots: files/crates with disproportionate fan-in or fan-out (>2× average) |
| `limitations.md` | Known limitations of the static analysis (conditional compilation, macros, re-exports) |
**Key findings:**
- No cycles detected at crate or file level (strict DAG structure)
- `g3-config` and `g3-providers` are the most depended-upon crates (fan-in: 4)
- `g3-cli` has highest fan-out (5 crate dependencies) as the composition root
- `ui_writer.rs` is the most imported file (11 dependents)
- `g3-core/src/lib.rs` has highest fan-out (25 module declarations)
These artifacts are useful for understanding coupling, planning refactors, and identifying architectural boundaries.
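
The ">2× average" hotspot rule from `hotspots.md` can be sketched for fan-in as follows. This is an illustration under assumptions, not the Euler agent's implementation: the edge representation, the function name, and the choice to average over every node that appears in an edge are all hypothetical.

```rust
use std::collections::BTreeMap;

/// Return nodes whose fan-in exceeds twice the average fan-in.
/// Each edge is `(from, to)`, meaning `from` depends on `to`.
/// ASSUMPTION: the average is taken over all nodes that appear in any edge.
fn fan_in_hotspots(edges: &[(&str, &str)]) -> Vec<String> {
    let mut fan_in: BTreeMap<&str, usize> = BTreeMap::new();
    for &(from, to) in edges {
        // Register the dependent too, so nodes with zero fan-in count toward the average.
        fan_in.entry(from).or_insert(0);
        *fan_in.entry(to).or_insert(0) += 1;
    }
    if fan_in.is_empty() {
        return Vec::new();
    }
    // Every edge adds exactly one unit of fan-in, so the total equals edges.len().
    let average = edges.len() as f64 / fan_in.len() as f64;
    let mut hotspots = Vec::new();
    for (name, count) in &fan_in {
        if *count as f64 > 2.0 * average {
            hotspots.push(name.to_string());
        }
    }
    hotspots
}
```

The same function applies to fan-out if each edge tuple is reversed before the call.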