Solves the tautology problem where the LLM would write invariants after
implementation, making them match what was done rather than constrain it.
Changes:
- plan_write now accepts 'rulespec' parameter
- New plans REQUIRE rulespec (fails with helpful error if missing)
- Plan updates don't require rulespec (backward compatible)
- Rulespec is parsed, validated, and written atomically with plan
- Updated system prompt with clear examples for new vs update
- Updated tool definition schema
- Updated all affected tests
New flow: task → plan+rulespec → user reviews BOTH → approve → implement
Clip summary text and other long fields to fit terminal width:
- Clip display_summary in print_tool_compact (e.g., "47 lines (2.0k chars)")
- Account for header_suffix length when compressing paths in print_tool_output_header
- Clip TODO item lines in print_todo_compact
- Clip plan item descriptions, evidence, touches, checks, and paths in print_plan_compact
- Replace hardcoded 70/40 char limits with dynamic terminal-width-based clipping
All clipping uses clip_line() which handles UTF-8 safely and adds ellipsis.
When a plan reaches a terminal state (all items done or blocked) in
interactive mode, automatically exit plan mode and return to normal
prompt.
Changes:
- Add Agent::is_plan_terminal() method to check if plan is complete
- Add check_and_exit_plan_mode_if_terminal() helper in interactive.rs
- Call the helper after each execute_user_input() to detect completion
Fixes issue where plan mode prompt ' >> ' persisted after plan completion.
- Add terminal_width module with get_terminal_width(), clip_line(),
compress_path(), and compress_command() utilities
- Update ConsoleUiWriter to use dynamic terminal width for all tool output
- Tool output lines are clipped to fit without wrapping
- Tool headers use semantic compression (paths preserve filename,
commands clip from right)
- 4-character right margin for visual clarity
- Minimum 40 columns, default 80 when terminal size unavailable
- All truncation is UTF-8 safe (char counting, not byte slicing)
- Add 13 unit tests for terminal width utilities
- Fix build warnings: add #[allow(dead_code)] to unused deserialization fields
- Fix plan approval gate bug: block file changes when no plan exists (not just
when plan exists but is unapproved)
- Add "Create a plan: " prefix to first user message in plan mode
- Add prepare_plan_mode_input() helper function for testability
- Reset is_first_plan_message flag when entering plan mode via /plan command
- Add tests for approval gate (no plan + no changes, no plan + changes)
- Add tests for prepare_plan_mode_input (happy, negative, boundary cases)
- Add format_envelope_markdown() function in invariants.rs for rich markdown
formatting of ActionEnvelope facts
- Add format_yaml_value_markdown() helper for recursive YAML value display
- Update execute_plan_read() to append rulespec and envelope sections
- Update execute_plan_write() to append envelope section alongside rulespec
- Add 3 tests for format_envelope_markdown (empty, with facts, null values)
When plan_read or plan_write is called, the output now includes:
- Plan YAML (as before)
- Rulespec section (if rulespec.yaml exists) with invariants grouped by source
- Envelope section (if envelope.yaml exists) with facts in readable format
Missing files show placeholder text rather than errors.
- Update ChecksCompact to use Vec<CheckCompact> for negative/boundary fields
- Add progress bar visualization showing done/doing/blocked/todo counts
- Show evidence for done items, checks for active items
- Display all negative and boundary checks (not just first)
- Add proper tree structure with └/├ prefixes
- Truncate long descriptions and evidence paths
- Add file path display with 📄 icon
Document the new Skills system introduced in recent commits:
- docs/architecture.md: Add Skills System section with discovery
priority, embedded skills, script extraction, and key types
- docs/skills.md: New comprehensive guide covering SKILL.md format,
discovery priority, embedded skills, research skill usage, and
troubleshooting
- README.md: Update Agent Skills section with correct priority order,
add embedded skills info, research skill usage, and link to Skills
Guide in Documentation Map
- AGENTS.md: Add skill creation to Adding Features, skill extraction
to Dangerous Code Paths, and new Skills System Entry Points section
All documentation links validated - no broken links or orphan files.
Agent: lamport
- Rewrite SKILL.md with inline instructions to spawn g3 --agent scout directly
- Extend read_file to handle embedded skill paths (<embedded:name>/SKILL.md)
- Remove scripts field from EmbeddedSkill struct (no longer needed)
- Delete extraction.rs module (was only for script extraction)
- Delete g3-research bash script
- Remove obsolete Async Research Tool section from workspace memory
Skills are now fully portable - they work when g3 is installed as a
binary without access to source files. Agents can read embedded skill
content via read_file with the special <embedded:...> path syntax.
- Remove is_embedded_skill() from discovery.rs (unused)
- Remove get_embedded_skills_map() from embedded.rs (unused)
- Remove associated tests for deleted functions
- Inline path check in test_repo_overrides_embedded test
This eliminates dead code warnings and reduces module surface area
without changing any behavior.
Agent: fowler
Focused analysis on past 10 commits covering:
- New skills module in g3-core (parser, discovery, prompt, embedded, extraction)
- Research tool externalized to skills/research/ skill
- SkillsConfig added to g3-config
- SDLC pipeline state moved to .g3/sdlc/
Key findings:
- 4 crates changed, 29 files affected (8 added, 2 deleted, 19 modified)
- No dependency cycles detected
- Clean DAG structure in new skills module
- Cross-crate coupling via g3-core::skills and g3-config::SkillsConfig
- Compile-time coupling to skills/research/ via include_str!
Agent: euler
When a Rust-only workspace was detected, the Language-Specific Guidance
header was appearing with no content because Rust has an empty prompt
string (agent-specific prompts handle Rust instead).
The fix filters out empty prompt strings in get_language_prompts_for_workspace()
so the header only appears when there's actual guidance content.
Added test to verify Rust-only workspaces return None.
- Web Research instructions now come from skills/research/SKILL.md
- Skills are dynamically loaded and injected via generate_skills_prompt()
- Remove test_both_prompts_have_web_research test (no longer applicable)
- Remove unused G3Status::research_complete() function
This completes the externalization of research as a skill.
Replaces the built-in research/research_status tools with a portable
skill-based approach:
- Add embedded skills infrastructure (skills compiled into binary)
- Add repo-local skills/ directory support (highest priority)
- Create research skill with SKILL.md and g3-research shell script
- Script extraction to .g3/bin/ with version tracking
- Filesystem-based handoff via .g3/research/<id>/status.json
- Remove PendingResearchManager and all research tool code
- Update system prompt to reference skill instead of tool
Benefits:
- No special tool infrastructure needed (just shell + read_file)
- Context-efficient (reports stay on disk until needed)
- Crash-resilient (state persisted to filesystem)
- Portable (skill can be overridden per-workspace)
Breaking change: research tool calls now return a deprecation message
pointing to the research skill.
The --resume flag was being ignored when --agent and --chat flags were
used together. The if-else chain checked for chat mode first and
immediately returned None, skipping the --resume check entirely.
Reordered the logic to check flags.resume first, ensuring explicit
--resume is always honored regardless of other flags.
Fixes: --resume not working with --agent --chat
- Add merge step before worktree cleanup when pipeline completes
- On success with commits: merge to main, then cleanup
- On failure: preserve worktree for debugging, print path
- On merge conflict: preserve worktree, print resolution instructions
- Move pipeline.json from analysis/sdlc/ to .g3/sdlc/ (gitignored)
- Remove the interactive prompt that asked users to resume in-progress sessions
- Remove unused new_session parameter from run_interactive()
- Remove unused info_inline() function from G3Status
- Explicit --resume <session_id> flag still works
CLI starts in plan mode by default (when not in agent mode), but was not
calling agent.set_plan_mode(true) at initialization. This meant the gate
check would not run until the user explicitly entered plan mode via /plan.
- Move system prompt for native tool calling models to prompts/system/native.md
- Use include_str! to embed at compile time
- Remove concatenated SHARED_* string constants
- Prompt is now readable/editable as a complete markdown document
- Non-native prompt still uses Rust constants (acceptable for now)
- Add in_plan_mode flag to Agent struct
- Add set_plan_mode() and is_plan_mode() methods
- Gate check now only runs when in_plan_mode is true
- CLI calls set_plan_mode(true) on /plan command and EnterPlanMode
- CLI calls set_plan_mode(false) on approval and CTRL-D exit
- Update integration test to enable plan mode
- Fix test YAML to use Vec<Check> for negative/boundary checks
Change Plan Mode to allow multiple negative and boundary checks per item,
while keeping happy path as a single check.
Schema change:
- checks.negative: Check -> Vec<Check> (>=1 required)
- checks.boundary: Check -> Vec<Check> (>=1 required)
- checks.happy: Check (unchanged, single)
This better reflects real-world tasks where there are often multiple
error conditions and edge cases worth tracking.
Changes:
- Update Checks struct to use Vec<Check> for negative/boundary
- Update validation to require at least 1 of each
- Update prompts and tool definitions with new array syntax
- Add 4 new tests for multi-check scenarios
- Add check_plan_approval_gate() in tools/plan.rs that runs after each tool call
- Detects file changes via git status --porcelain when plan exists but not approved
- Reverts changes: git checkout for modified files, rm for new untracked files
- Returns blocking message instructing LLM to create/approve plan first
- Add ApprovalGateResult enum with Allowed/Blocked/NotGitRepo variants
- Add set_session_id() and set_working_dir() methods on Agent for testing
- Add integration test using MockProvider to simulate blocked write_file
Implements a pipeline that orchestrates 7 g3 agents in sequence:
1. euler - dependency graph and hotspots analysis
2. breaker - whitebox exploration and edge-case discovery
3. hopper - deep testing and regression integrity
4. fowler - refactoring to deduplicate and reduce complexity
5. carmack - in-place rewriting for readability and concision
6. lamport - human-readable documentation and validation
7. huffman - semantic compression of memory
Features:
- Commit cursor tracking (--from flag to set starting point)
- Crash recovery (resumes from last incomplete stage)
- Git worktree isolation for all pipeline work
- Visual pipeline display with status icons
- Summary generation saved to .g3/sessions/sdlc/
- Pipeline state persisted to analysis/sdlc/pipeline.json
CLI:
- studio sdlc run [-c N] [--from COMMIT]
- studio sdlc status
- studio sdlc reset
Also adds huffman agent to embedded agents list.
- Add --resume CLI flag that conflicts with --new-session
- Add load_continuation_by_id() to load sessions by full or partial ID
- Support loading from latest.json or falling back to session.json
- Handle --resume in both normal and agent modes
- Agent mode validates session belongs to correct agent
Adds rulespec.yaml and envelope.yaml support for machine-readable
invariant checking during plan completion.
- Add invariants module with Rulespec, ActionEnvelope, and evaluation logic
- Add Invariants section to system prompt with workflow instructions
- Show rulespec/envelope file status in plan verification output
- Rulespec written during planning (captures constraints from task)
- Envelope written after implementation (documents what was built)
Implements the Agent Skills specification (https://agentskills.io) for
portable skill packages that give the agent new capabilities.
Changes:
- Add skills module with SKILL.md parser (YAML frontmatter + markdown body)
- Implement skill discovery from ~/.g3/skills/, config extra_paths, and .g3/skills/
- Generate <available_skills> XML for system prompt injection
- Add SkillsConfig to g3-config with enabled flag and extra_paths
- Wire skills discovery into CLI startup
- Add 29 unit tests for parser, discovery, and prompt generation
- Update README with Agent Skills documentation
Skill locations (priority order):
1. ~/.g3/skills/ (global)
2. Config extra_paths
3. .g3/skills/ (workspace, highest priority)
At startup, g3 scans skill directories and injects a summary into the
system prompt. When the agent needs a skill, it reads the full SKILL.md
using the read_file tool.
- Add auto-approval logic in execute_plan_write() when ctx.is_autonomous is true
- Update system prompt to document auto-approval behavior
- Plans still require explicit approval in interactive mode
Added plan_approve to the compact tool list in format_tool_result_summary()
so it displays in the same format as other tools like read_file and write_file.
The format_plan_approve_summary() function already existed but was never
called because plan_approve was missing from the matches! block.
- Start g3 in plan mode with ' >>' prompt and welcome message
- Add is_approval_input() to detect 'approve', 'a', 'yes', etc. and misspellings
- Allow trailing punctuation (!, ., ,) on approval words
- Call plan_approve tool directly without LLM when approval detected
- Add synthetic assistant message after approval for LLM context
- Exit plan mode after successful approval, return to 'g3>' prompt
- CTRL-D in plan mode exits plan mode first, then exits g3
- /plan command enters plan mode and shows welcome message
- Agent mode (--agent) does not start in plan mode
- Add CommandResult enum to signal plan mode entry from commands
- Fix vertical bar continuation: │ continues all the way down, only the
very last sub-line (boundary of last item) gets └
- Add visual gap before plan file path and change 📄 to ->
- Dedent file path to align with tree root
- Fix plan_approve to use proper compact tool format (was missing from
is_compact_tool matches! in print_tool_compact, causing it to fall
through to regular output with | prefix)
- Update command matching from /feature to /plan in commands.rs
- Update help text, usage message, and example
- Update workspace memory references
- /feature is no longer recognized (completely removed)
Fixes two bugs in the input formatter:
1. Single/double quote regex now requires word boundaries:
- Contractions like it's, don't, won't no longer trigger highlighting
- Only properly quoted text like 'special' or "hello" gets cyan
- Mixed input like "it's a 'test' case" only highlights 'test'
2. Visual line calculation fix for exact terminal width:
- When text exactly fills terminal width, cursor wraps to next line
- Added +1 adjustment to account for this edge case
- Extracted calculate_visual_lines() for testability
Added 9 new tests covering all edge cases.
Plan tools (plan_read, plan_write) now display with elegant tree-style
formatting similar to the old todo_write UI:
- State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked)
- Tree prefixes (├/└) for items with child details
- Strikethrough for completed items
- Shows touches and all three checks (happy/negative/boundary)
- Displays plan file path link at the end
plan_approve uses compact single-line format like read_file:
- Shows approval status and revision number
- Handles already-approved and error cases
Changes:
- Add print_plan_compact() to UiWriter trait with default impl
- Implement print_plan_compact() in ConsoleUiWriter
- Call print_plan_compact() from execute_plan_read/write
- Add plan_read/plan_write to is_self_handled_tool()
- Add plan_approve to is_compact_tool() with format_plan_approve_summary()
- Add serde_yaml dependency to g3-cli
Adds a verification system that checks evidence in completed plan items:
- Evidence parsing: supports code locations (file:line, file:line-line, file only)
and test references (file::test_name)
- Code location verification: checks file exists, validates line numbers in range
- Test reference verification: checks test file exists, searches for fn pattern
- Verification results: Verified, Warning, Error, Skipped statuses
- Loud output formatting with emoji indicators for warnings/errors
- Integration with execute_plan_write(): runs when plan is complete and approved
- 12 new unit tests covering parsing and verification
Warnings are advisory (don't block), errors are loud but also don't block.
Blocked items are skipped during verification.
Plan Mode is a cognitive forcing system that requires reasoning about:
- Happy path
- Negative case
- Boundary condition
New tools:
- plan_read: Read current plan for session
- plan_write: Create/update plan with YAML content (validates structure)
- plan_approve: Mark current revision as approved
New command:
- /feature <description>: Start Plan Mode for a new feature
Plan schema requires:
- plan_id, revision, approved_revision
- items with id, description, state, touches, checks (happy/negative/boundary)
- evidence and notes required when marking items done
Verification:
- plan_verify() called automatically when all items are done/blocked
Removed:
- todo_read, todo_write tools
- todo.rs module and related tests