alex/g3

Dhanji R. Prasanna f415dbb84b Fix ACD turn summary loss and add /dump command

ACD (Aggressive Context Dehydration) fixes:
- Fixed dehydrate_context() to extract turn summary from context window
  instead of using the passed-in final_response (which contained only
  the timing footer, not the actual LLM response)
- Removed final_response parameter from dehydrate_context() since it
  now self-extracts the last assistant message as the summary
- This ensures the actual turn summary is preserved after dehydration,
  not just the timing footer

New /dump command:
- Added /dump command to dump entire context window to tmp/ for debugging
- Shows message index, role, kind, content length, and full content
- Available in both console and machine modes

UTF-8 safety:
- Fixed truncate_to_word_boundary() to use character indices instead of
  byte indices, preventing panics on multi-byte UTF-8 characters
- Added UTF-8 string slicing guidance to AGENTS.md

Agent: g3

2026-01-12 05:13:02 +05:30

5.4 KiB

AGENTS.md - Machine Instructions for G3

Purpose: Machine-specific instructions for AI agents working with this codebase.
For project overview, architecture, and usage: See README.md

Critical Invariants

MUST Hold

Tool calls must be valid JSON - The streaming parser expects well-formed tool calls
Context window limits must be respected - Exceeding limits causes API errors
Provider trait implementations must be Send + Sync - Required for async runtime
Session IDs must be unique - Used for log file paths and TODO scoping
File paths in tools support tilde expansion - ~ expands to home directory
Streaming is preferred - Non-streaming requests block UI
Tool results are size-limited - Large outputs are truncated or thinned automatically
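
The Send + Sync invariant above can be checked at compile time. The following is a minimal sketch using only the standard library: `ExampleProvider`, `MockProvider`, `assert_send_sync`, and `name_from_other_thread` are hypothetical names for illustration, and the real `LLMProvider` trait has its own methods that are not reproduced here.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a provider trait; the real `LLMProvider`
/// trait in g3-providers has its own methods, which are not shown here.
trait ExampleProvider: Send + Sync {
    fn name(&self) -> &str;
}

struct MockProvider;

impl ExampleProvider for MockProvider {
    fn name(&self) -> &str {
        "mock"
    }
}

/// Compiles only if `T` is both Send and Sync.
fn assert_send_sync<T: Send + Sync + ?Sized>() {}

/// Moves a shared provider to another thread, which requires Send + Sync.
fn name_from_other_thread(provider: Arc<dyn ExampleProvider>) -> String {
    thread::spawn(move || provider.name().to_string())
        .join()
        .expect("worker thread panicked")
}
```

Calling `assert_send_sync::<MockProvider>()` in a unit test turns a missing bound into a compile error rather than a confusing failure at the point where the provider is first shared across tasks.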

MUST NOT Do

Never block the async runtime - Use tokio::task::spawn_blocking for CPU-intensive or blocking work (tokio::spawn alone still runs the task on the async worker threads)
Never store secrets in logs - API keys are redacted in error logs
Never modify files outside working directory without explicit permission
Never assume tool results fit in context - Large results are thinned automatically

Recommended Entry Points

For Understanding the System

src/main.rs - Entry point (trivial)
crates/g3-cli/src/lib.rs - CLI logic and execution modes
crates/g3-core/src/lib.rs - Agent struct and orchestration
crates/g3-providers/src/lib.rs - Provider trait definition

For Adding Features

New tool: crates/g3-core/src/tool_definitions.rs → crates/g3-core/src/tools/
New provider: crates/g3-providers/src/ → implement LLMProvider trait
New CLI mode: crates/g3-cli/src/lib.rs
New config option: crates/g3-config/src/lib.rs

For Debugging

Session logs: .g3/sessions/<session_id>/session.json
Error logs: logs/errors/
Context state: Use /stats command in interactive mode

Dangerous/Subtle Code Paths

Context Window Management (`g3-core/src/context_window.rs`)

Thinning: Automatically replaces large tool results with file references
Summarization: Compresses conversation history at 80% capacity
Token estimation: Uses character-based heuristics, not exact tokenization
Risk: Incorrect token estimates can cause context overflow
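
To make the estimation risk concrete, here is a rough sketch of a character-based estimate combined with the 80% summarization threshold. The 4-characters-per-token ratio and both function names are assumptions for illustration, not values taken from `context_window.rs`.

```rust
/// Rough token estimate. The 4-characters-per-token ratio is an assumed
/// heuristic for illustration; the real ratio varies by model and language.
fn estimate_tokens(text: &str) -> usize {
    const CHARS_PER_TOKEN: usize = 4; // assumption, not a g3 constant
    let chars = text.chars().count();
    (chars + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN // round up
}

/// True once estimated usage reaches 80% of the limit.
fn should_summarize(used_tokens: usize, limit_tokens: usize) -> bool {
    // Integer arithmetic: used / limit >= 80 / 100, without floating point.
    used_tokens.saturating_mul(100) >= limit_tokens.saturating_mul(80)
}
```

Because the estimate can undershoot the provider's real tokenizer (code and non-English text often tokenize less efficiently), any change to the heuristic should keep headroom between the threshold and the hard limit.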

Streaming Parser (`g3-core/src/streaming_parser.rs`)

Parses LLM responses in real-time for tool calls
Must handle partial JSON across chunk boundaries
Risk: Malformed responses can cause parsing failures
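
The chunk-boundary problem can be illustrated with a small accumulator that tracks brace depth and string state across calls. This is a hypothetical sketch (`JsonChunkBuffer` is not a g3 type) and it only finds where a top-level object ends; validating the extracted text would still be the job of a real JSON parser.

```rust
/// Accumulates streamed text and yields each complete top-level JSON object.
/// Hypothetical sketch: it tracks brace depth and string state only, and
/// ignores any text that appears outside an object.
#[derive(Default)]
struct JsonChunkBuffer {
    current: String, // bytes of the object currently being assembled
    depth: usize,    // nesting depth of '{' seen so far
    in_string: bool, // inside a JSON string literal
    escaped: bool,   // previous character was a backslash inside a string
}

impl JsonChunkBuffer {
    /// Feed one chunk; returns every object completed by this chunk.
    fn push(&mut self, chunk: &str) -> Vec<String> {
        let mut complete = Vec::new();
        for c in chunk.chars() {
            if self.depth == 0 {
                // Outside any object: wait for an opening brace.
                if c == '{' {
                    self.depth = 1;
                    self.current.push(c);
                }
                continue;
            }
            self.current.push(c);
            if self.in_string {
                // Braces and quotes inside strings must not affect depth.
                if self.escaped {
                    self.escaped = false;
                } else if c == '\\' {
                    self.escaped = true;
                } else if c == '"' {
                    self.in_string = false;
                }
                continue;
            }
            match c {
                '"' => self.in_string = true,
                '{' => self.depth += 1,
                '}' => {
                    self.depth -= 1;
                    if self.depth == 0 {
                        complete.push(std::mem::take(&mut self.current));
                    }
                }
                _ => {}
            }
        }
        complete
    }
}
```

The state lives in the struct rather than in local variables, so a string or escape sequence that starts in one chunk and ends in the next is handled the same way as one contained in a single chunk.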

Tool Dispatch (`g3-core/src/tool_dispatch.rs`)

Routes tool calls to implementations
Handles both native and JSON-based tool calling
Risk: Missing dispatch cases cause silent failures
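
One way to avoid the silent-failure risk is to make the fallback arm return an explicit error. The sketch below assumes a simple name-based `match`; the tool names, `DispatchError`, and `dispatch_tool` are hypothetical and do not reflect the actual contents of `tool_dispatch.rs`.

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug, PartialEq)]
enum DispatchError {
    UnknownTool(String),
}

/// Routes a tool call by name. Tool names here are made up for the example.
fn dispatch_tool(name: &str, args: &str) -> Result<String, DispatchError> {
    match name {
        "echo" => Ok(args.to_string()),
        "char_count" => Ok(args.chars().count().to_string()),
        // The catch-all arm reports the unknown name instead of returning
        // an empty success, so a missing case surfaces to the caller.
        other => Err(DispatchError::UnknownTool(other.to_string())),
    }
}
```

Returning the error to the model as a tool result, rather than dropping it, gives the agent a chance to correct the tool name on the next turn.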

Retry Logic (`g3-core/src/retry.rs`)

Exponential backoff with jitter
Different configs for interactive vs autonomous mode
Risk: Aggressive retries can make rate limiting worse
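
As an illustration of exponential backoff with jitter, the sketch below computes a capped delay and scales it by a caller-supplied random sample ("full jitter"). `RetryConfig`, its fields, and both functions are assumptions made for this example; the actual configuration in `retry.rs` may differ. Passing the jitter sample in as a parameter keeps the function deterministic and avoids a random-number dependency.

```rust
use std::time::Duration;

/// Hypothetical retry settings; field names and values are illustrative.
#[derive(Debug, Clone, Copy)]
struct RetryConfig {
    base_delay_ms: u64,
    max_delay_ms: u64,
    max_attempts: u32,
}

/// Whether another retry is allowed after `attempt` (0-based) has failed.
fn should_retry(cfg: &RetryConfig, attempt: u32) -> bool {
    attempt + 1 < cfg.max_attempts
}

/// Delay before retry number `attempt` (0-based).
/// `jitter` is a caller-supplied sample in [0.0, 1.0]; the delay is that
/// fraction of the capped exponential value.
fn backoff_delay(cfg: &RetryConfig, attempt: u32, jitter: f64) -> Duration {
    // 2^attempt, falling back to u64::MAX when the shift would overflow.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let capped = cfg
        .base_delay_ms
        .saturating_mul(factor)
        .min(cfg.max_delay_ms);
    let jitter = jitter.clamp(0.0, 1.0);
    Duration::from_millis((capped as f64 * jitter) as u64)
}
```

An interactive configuration would plausibly use a lower `max_delay_ms` and fewer attempts than an autonomous one, since a user is waiting on the result.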

UTF-8 String Slicing (Throughout Codebase)

Rust string slices (&s[..n]) use byte indices, not character indices
Multi-byte UTF-8 characters (emoji, bullets •, ×, ⚡) cause panics if sliced mid-character
Risk: Runtime panic whenever a byte index lands inside a multi-byte character, which can happen with any string containing non-ASCII text

Fix: Use char_indices() to find byte boundaries:

let byte_idx = s.char_indices().nth(char_limit).map(|(i, _)| i).unwrap_or(s.len());
let truncated = &s[..byte_idx];

Danger zones: Display truncation, ACD stubs, user input handling
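
The fix above can be wrapped in small reusable helpers. The following sketch shows a character-count variant and a byte-budget variant; both function names are hypothetical and are not the codebase's `truncate_to_word_boundary()`.

```rust
/// Truncate `s` to at most `max_chars` characters without splitting a
/// multi-byte UTF-8 sequence. (Hypothetical helper, not from the g3 codebase.)
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    // char_indices() yields the byte offset at which each character starts,
    // so the nth entry is always a valid slice boundary.
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // max_chars or fewer characters: nothing to cut
    }
}

/// Byte-budget variant: back off to the nearest character boundary.
fn truncate_bytes(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // is_char_boundary(0) is always true, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

The character-count variant suits display truncation, while the byte-budget variant suits size limits that are expressed in bytes, such as stub or log size caps.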

Do's and Don'ts for Automated Changes

Do

✅ Run cargo check after modifications
✅ Run cargo test before committing
✅ Update tool definitions when adding tools
✅ Add tests for new functionality
✅ Use existing patterns for similar features
✅ Keep functions under 80 lines
✅ Update documentation for user-facing changes

Don't

❌ Modify Cargo.toml dependencies without justification
❌ Add blocking code in async contexts
❌ Store sensitive data in plain text
❌ Ignore error handling
❌ Create deeply nested conditionals (>6 levels)
❌ Add external dependencies for simple tasks

Common Incorrect Assumptions

"All providers support tool calling" - Embedded models use JSON fallback
"Context window is unlimited" - Each provider has limits (4k-200k tokens)
"Tool results are always small" - File reads can return megabytes
"Sessions persist across runs" - Sessions are ephemeral by default
"All platforms are equal" - macOS has more features (Vision, Accessibility)
"String length equals character count" - s.len() returns bytes; use s.chars().count() for characters

Dependency Analysis Artifacts

The analysis/deps/ directory contains static analysis artifacts generated by the Euler agent:

| File | Purpose |
|------|---------|
| `graph.json` | Raw dependency graph data (crate and file-level edges with evidence) |
| `graph.summary.md` | Overview metrics: crate counts, edge counts, fan-in/fan-out rankings |
| `sccs.md` | Strongly Connected Components analysis (cycle detection via Tarjan's algorithm) |
| `layers.observed.md` | Mechanically-derived layer diagram showing crate hierarchy and intra-crate module structure |
| `hotspots.md` | Coupling hotspots: files/crates with disproportionate fan-in or fan-out (>2× average) |
| `limitations.md` | Known limitations of the static analysis (conditional compilation, macros, re-exports) |

These artifacts are useful for understanding coupling, planning refactors, and identifying architectural boundaries.
