Restores the research tool that was previously externalized as a skill:
- Add pending_research.rs: PendingResearchManager with thread-safe task tracking
- Add tools/research.rs: execute_research (async), execute_research_status
- Add research/research_status tool definitions with exclude_research config
- Integrate PendingResearchManager into Agent and ToolContext
- Inject completed research results in streaming loop
Remove research skill:
- Clear EMBEDDED_SKILLS array in embedded.rs
- Delete skills/research/ directory
- Update all tests expecting embedded research skill
- Update docs and memory to reflect the change
The research tool now:
- Spawns scout agent in background tokio task
- Returns immediately with research_id
- Automatically injects results into conversation when ready
- Supports status checks via research_status tool
The <location> field in the skills XML prompt was being XML-escaped,
converting <embedded:research>/SKILL.md to <embedded:research>/SKILL.md.
When the LLM tried to use read_file with this escaped path, it would fail.
Changes:
- Remove escape_xml() call from location field in prompt.rs
- Add fallback handling for escaped paths in try_read_embedded_skill()
- Add tests for both prompt generation and read_file handling
Fixes embedded skill loading for agents like butler running outside the g3 repo.
Removed redundant and vague content from prompts/system/native.md:
- Simplified intro from 17 lines to 3 lines
- Reduced Code Search section to one line
- Removed duplicate Plan Mode example (kept one)
- Removed Action Envelope section (rarely used correctly)
- Removed verbose Memory Format details (tool description covers it)
- Removed Response Guidelines (obvious to modern LLMs)
Size: 8,620 chars -> 4,498 chars
Also updated:
- G3_IDENTITY_LINE constant for agent mode compatibility
- Test assertions to check for new prompt markers
- System prompt validation to use new marker string
The loaded status line (✓ AGENTS.md ✓ Memory) already indicates that
AGENTS.md was loaded, so the separate '>> AGENTS.md - Machine Instructions'
heading line was redundant.
- Remove print_project_heading() function from display.rs
- Remove extract_project_heading call from interactive.rs
- Clean up unused imports
The shell tool output line was wrapping because update_tool_output_line
clipped the content without reserving space for the suffix that gets
appended later (line count + timing info).
Added suffix_overhead of 30 chars for shell tools to reserve space for:
- " (9999 lines)" = ~13 chars
- " | 99999 ◉ 999ms" = ~17 chars
This ensures the complete line fits within terminal width without wrapping.
Solves the tautology problem where the LLM would write invariants after
implementation, making them match what was done rather than constrain it.
Changes:
- plan_write now accepts 'rulespec' parameter
- New plans REQUIRE rulespec (fails with helpful error if missing)
- Plan updates don't require rulespec (backward compatible)
- Rulespec is parsed, validated, and written atomically with plan
- Updated system prompt with clear examples for new vs update
- Updated tool definition schema
- Updated all affected tests
New flow: task → plan+rulespec → user reviews BOTH → approve → implement
Clip summary text and other long fields to fit terminal width:
- Clip display_summary in print_tool_compact (e.g., "47 lines (2.0k chars)")
- Account for header_suffix length when compressing paths in print_tool_output_header
- Clip TODO item lines in print_todo_compact
- Clip plan item descriptions, evidence, touches, checks, and paths in print_plan_compact
- Replace hardcoded 70/40 char limits with dynamic terminal-width-based clipping
All clipping uses clip_line() which handles UTF-8 safely and adds ellipsis.
When a plan reaches a terminal state (all items done or blocked) in
interactive mode, automatically exit plan mode and return to normal
prompt.
Changes:
- Add Agent::is_plan_terminal() method to check if plan is complete
- Add check_and_exit_plan_mode_if_terminal() helper in interactive.rs
- Call the helper after each execute_user_input() to detect completion
Fixes issue where plan mode prompt ' >> ' persisted after plan completion.
- Add terminal_width module with get_terminal_width(), clip_line(),
compress_path(), and compress_command() utilities
- Update ConsoleUiWriter to use dynamic terminal width for all tool output
- Tool output lines are clipped to fit without wrapping
- Tool headers use semantic compression (paths preserve filename,
commands clip from right)
- 4-character right margin for visual clarity
- Minimum 40 columns, default 80 when terminal size unavailable
- All truncation is UTF-8 safe (char counting, not byte slicing)
- Add 13 unit tests for terminal width utilities
- Fix build warnings: add #[allow(dead_code)] to unused deserialization fields
- Fix plan approval gate bug: block file changes when no plan exists (not just
when plan exists but is unapproved)
- Add "Create a plan: " prefix to first user message in plan mode
- Add prepare_plan_mode_input() helper function for testability
- Reset is_first_plan_message flag when entering plan mode via /plan command
- Add tests for approval gate (no plan + no changes, no plan + changes)
- Add tests for prepare_plan_mode_input (happy, negative, boundary cases)
- Add format_envelope_markdown() function in invariants.rs for rich markdown
formatting of ActionEnvelope facts
- Add format_yaml_value_markdown() helper for recursive YAML value display
- Update execute_plan_read() to append rulespec and envelope sections
- Update execute_plan_write() to append envelope section alongside rulespec
- Add 3 tests for format_envelope_markdown (empty, with facts, null values)
When plan_read or plan_write is called, the output now includes:
- Plan YAML (as before)
- Rulespec section (if rulespec.yaml exists) with invariants grouped by source
- Envelope section (if envelope.yaml exists) with facts in readable format
Missing files show placeholder text rather than errors.
- Update ChecksCompact to use Vec<CheckCompact> for negative/boundary fields
- Add progress bar visualization showing done/doing/blocked/todo counts
- Show evidence for done items, checks for active items
- Display all negative and boundary checks (not just first)
- Add proper tree structure with └/├ prefixes
- Truncate long descriptions and evidence paths
- Add file path display with 📄 icon
Document the new Skills system introduced in recent commits:
- docs/architecture.md: Add Skills System section with discovery
priority, embedded skills, script extraction, and key types
- docs/skills.md: New comprehensive guide covering SKILL.md format,
discovery priority, embedded skills, research skill usage, and
troubleshooting
- README.md: Update Agent Skills section with correct priority order,
add embedded skills info, research skill usage, and link to Skills
Guide in Documentation Map
- AGENTS.md: Add skill creation to Adding Features, skill extraction
to Dangerous Code Paths, and new Skills System Entry Points section
All documentation links validated - no broken links or orphan files.
Agent: lamport
- Rewrite SKILL.md with inline instructions to spawn g3 --agent scout directly
- Extend read_file to handle embedded skill paths (<embedded:name>/SKILL.md)
- Remove scripts field from EmbeddedSkill struct (no longer needed)
- Delete extraction.rs module (was only for script extraction)
- Delete g3-research bash script
- Remove obsolete Async Research Tool section from workspace memory
Skills are now fully portable - they work when g3 is installed as a
binary without access to source files. Agents can read embedded skill
content via read_file with the special <embedded:...> path syntax.
- Remove is_embedded_skill() from discovery.rs (unused)
- Remove get_embedded_skills_map() from embedded.rs (unused)
- Remove associated tests for deleted functions
- Inline path check in test_repo_overrides_embedded test
This eliminates dead code warnings and reduces module surface area
without changing any behavior.
Agent: fowler
Focused analysis on past 10 commits covering:
- New skills module in g3-core (parser, discovery, prompt, embedded, extraction)
- Research tool externalized to skills/research/ skill
- SkillsConfig added to g3-config
- SDLC pipeline state moved to .g3/sdlc/
Key findings:
- 4 crates changed, 29 files affected (8 added, 2 deleted, 19 modified)
- No dependency cycles detected
- Clean DAG structure in new skills module
- Cross-crate coupling via g3-core::skills and g3-config::SkillsConfig
- Compile-time coupling to skills/research/ via include_str!
Agent: euler
When a Rust-only workspace was detected, the Language-Specific Guidance
header was appearing with no content because Rust has an empty prompt
string (agent-specific prompts handle Rust instead).
The fix filters out empty prompt strings in get_language_prompts_for_workspace()
so the header only appears when there's actual guidance content.
Added test to verify Rust-only workspaces return None.
- Web Research instructions now come from skills/research/SKILL.md
- Skills are dynamically loaded and injected via generate_skills_prompt()
- Remove test_both_prompts_have_web_research test (no longer applicable)
- Remove unused G3Status::research_complete() function
This completes the externalization of research as a skill.
Replaces the built-in research/research_status tools with a portable
skill-based approach:
- Add embedded skills infrastructure (skills compiled into binary)
- Add repo-local skills/ directory support (highest priority)
- Create research skill with SKILL.md and g3-research shell script
- Script extraction to .g3/bin/ with version tracking
- Filesystem-based handoff via .g3/research/<id>/status.json
- Remove PendingResearchManager and all research tool code
- Update system prompt to reference skill instead of tool
Benefits:
- No special tool infrastructure needed (just shell + read_file)
- Context-efficient (reports stay on disk until needed)
- Crash-resilient (state persisted to filesystem)
- Portable (skill can be overridden per-workspace)
Breaking change: research tool calls now return a deprecation message
pointing to the research skill.
The --resume flag was being ignored when --agent and --chat flags were
used together. The if-else chain checked for chat mode first and
immediately returned None, skipping the --resume check entirely.
Reordered the logic to check flags.resume first, ensuring explicit
--resume is always honored regardless of other flags.
Fixes: --resume not working with --agent --chat
- Add merge step before worktree cleanup when pipeline completes
- On success with commits: merge to main, then cleanup
- On failure: preserve worktree for debugging, print path
- On merge conflict: preserve worktree, print resolution instructions
- Move pipeline.json from analysis/sdlc/ to .g3/sdlc/ (gitignored)
- Remove the interactive prompt that asked users to resume in-progress sessions
- Remove unused new_session parameter from run_interactive()
- Remove unused info_inline() function from G3Status
- Explicit --resume <session_id> flag still works
CLI starts in plan mode by default (when not in agent mode), but was not
calling agent.set_plan_mode(true) at initialization. This meant the gate
check would not run until the user explicitly entered plan mode via /plan.
- Move system prompt for native tool calling models to prompts/system/native.md
- Use include_str! to embed at compile time
- Remove concatenated SHARED_* string constants
- Prompt is now readable/editable as a complete markdown document
- Non-native prompt still uses Rust constants (acceptable for now)
- Add in_plan_mode flag to Agent struct
- Add set_plan_mode() and is_plan_mode() methods
- Gate check now only runs when in_plan_mode is true
- CLI calls set_plan_mode(true) on /plan command and EnterPlanMode
- CLI calls set_plan_mode(false) on approval and CTRL-D exit
- Update integration test to enable plan mode
- Fix test YAML to use Vec<Check> for negative/boundary checks
Change Plan Mode to allow multiple negative and boundary checks per item,
while keeping happy path as a single check.
Schema change:
- checks.negative: Check -> Vec<Check> (>=1 required)
- checks.boundary: Check -> Vec<Check> (>=1 required)
- checks.happy: Check (unchanged, single)
This better reflects real-world tasks where there are often multiple
error conditions and edge cases worth tracking.
Changes:
- Update Checks struct to use Vec<Check> for negative/boundary
- Update validation to require at least 1 of each
- Update prompts and tool definitions with new array syntax
- Add 4 new tests for multi-check scenarios
- Add check_plan_approval_gate() in tools/plan.rs that runs after each tool call
- Detects file changes via git status --porcelain when plan exists but not approved
- Reverts changes: git checkout for modified files, rm for new untracked files
- Returns blocking message instructing LLM to create/approve plan first
- Add ApprovalGateResult enum with Allowed/Blocked/NotGitRepo variants
- Add set_session_id() and set_working_dir() methods on Agent for testing
- Add integration test using MockProvider to simulate blocked write_file
Implements a pipeline that orchestrates 7 g3 agents in sequence:
1. euler - dependency graph and hotspots analysis
2. breaker - whitebox exploration and edge-case discovery
3. hopper - deep testing and regression integrity
4. fowler - refactoring to deduplicate and reduce complexity
5. carmack - in-place rewriting for readability and concision
6. lamport - human-readable documentation and validation
7. huffman - semantic compression of memory
Features:
- Commit cursor tracking (--from flag to set starting point)
- Crash recovery (resumes from last incomplete stage)
- Git worktree isolation for all pipeline work
- Visual pipeline display with status icons
- Summary generation saved to .g3/sessions/sdlc/
- Pipeline state persisted to analysis/sdlc/pipeline.json
CLI:
- studio sdlc run [-c N] [--from COMMIT]
- studio sdlc status
- studio sdlc reset
Also adds huffman agent to embedded agents list.
- Add --resume CLI flag that conflicts with --new-session
- Add load_continuation_by_id() to load sessions by full or partial ID
- Support loading from latest.json or falling back to session.json
- Handle --resume in both normal and agent modes
- Agent mode validates session belongs to correct agent
Adds rulespec.yaml and envelope.yaml support for machine-readable
invariant checking during plan completion.
- Add invariants module with Rulespec, ActionEnvelope, and evaluation logic
- Add Invariants section to system prompt with workflow instructions
- Show rulespec/envelope file status in plan verification output
- Rulespec written during planning (captures constraints from task)
- Envelope written after implementation (documents what was built)
Implements the Agent Skills specification (https://agentskills.io) for
portable skill packages that give the agent new capabilities.
Changes:
- Add skills module with SKILL.md parser (YAML frontmatter + markdown body)
- Implement skill discovery from ~/.g3/skills/, config extra_paths, and .g3/skills/
- Generate <available_skills> XML for system prompt injection
- Add SkillsConfig to g3-config with enabled flag and extra_paths
- Wire skills discovery into CLI startup
- Add 29 unit tests for parser, discovery, and prompt generation
- Update README with Agent Skills documentation
Skill locations (priority order):
1. ~/.g3/skills/ (global)
2. Config extra_paths
3. .g3/skills/ (workspace, highest priority)
At startup, g3 scans skill directories and injects a summary into the
system prompt. When the agent needs a skill, it reads the full SKILL.md
using the read_file tool.
- Add auto-approval logic in execute_plan_write() when ctx.is_autonomous is true
- Update system prompt to document auto-approval behavior
- Plans still require explicit approval in interactive mode
Added plan_approve to the compact tool list in format_tool_result_summary()
so it displays in the same format as other tools like read_file and write_file.
The format_plan_approve_summary() function already existed but was never
called because plan_approve was missing from the matches! block.
- Start g3 in plan mode with ' >>' prompt and welcome message
- Add is_approval_input() to detect 'approve', 'a', 'yes', etc. and misspellings
- Allow trailing punctuation (!, ., ,) on approval words
- Call plan_approve tool directly without LLM when approval detected
- Add synthetic assistant message after approval for LLM context
- Exit plan mode after successful approval, return to 'g3>' prompt
- CTRL-D in plan mode exits plan mode first, then exits g3
- /plan command enters plan mode and shows welcome message
- Agent mode (--agent) does not start in plan mode
- Add CommandResult enum to signal plan mode entry from commands