Switch compact tool display to use bold ANSI colors (TOOL_COLOR_*_BOLD)
for tool names, matching the non-compact tool header style.
Affected: print_tool_compact, print_todo_compact, print_plan_compact,
and the streaming hint indicator (ParsingHintState::handle_hint).
Remove now-unused non-bold constants TOOL_COLOR_NORMAL and TOOL_COLOR_AGENT.
When streaming markdown headers containing inline tags (backticks, bold,
italic), the closing delimiter triggered early emission via
emit_formatted_inline(). Since format_header() appends a newline, any
text after the closing tag ended up on a separate line.
Added an in_header guard to handle_delimiter() so headers wait for the
actual newline to emit as a complete line. Added 4 char-by-char streaming
tests covering the bug pattern.
- Remove all file revert/delete logic from check_plan_approval_gate:
no more git checkout or fs::remove_file calls. The gate only warns.
- Remove reverted_files field from ApprovalGateResult::Blocked.
- Add get_dirty_files() helper to snapshot dirty files as a HashSet.
- Capture baseline dirty files when plan mode starts (set_plan_mode).
Pre-existing dirty files are excluded from gate checks so they
never trigger blocking.
- Add 5 new unit tests covering non-destructive behavior, baseline
exclusion, and mixed baseline/new file scenarios.
- Update integration test to match new non-destructive semantics.
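The baseline exclusion amounts to a set difference. A minimal sketch, assuming dirty files are tracked as path strings; `files_to_check` is a hypothetical name for illustration, not a function from this commit:

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns the files that became dirty after plan mode
/// started, i.e. the current dirty set minus the baseline snapshot.
fn files_to_check(current: &HashSet<String>, baseline: &HashSet<String>) -> Vec<String> {
    let mut new_files: Vec<String> = current.difference(baseline).cloned().collect();
    new_files.sort(); // deterministic order for display and tests
    new_files
}
```

With a baseline of `a.rs` and a current dirty set of `a.rs` and `b.rs`, only `b.rs` would be considered by the gate.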
Add load_toolset to the compact tool lists in streaming.rs and
ui_writer_impl.rs so it renders as a single-line summary instead of
the full multi-line tool definitions output.
Summary format:
🧩 loaded '<name>' — on success
ℹ️ already loaded — when already loaded
❌ failed — on error
The toolset name arg is extracted as display_arg in print_tool_compact
so it appears in the compact output line.
Features:
- New predicate rules: NotContains, AnyOf, NoneOf
- Conditional predicates via when clauses (WhenCondition/CompiledWhenCondition)
- Null handling: YAML null treated as absent for exists/not_exists
- Solon agent for rulespec authoring (agents/solon.md)
- Rulespec schema documentation (prompts/schemas/rulespec.schema.md)
Bugfix:
- Fixed when-condition evaluation in the datalog path: the catch-all branch
did a naive string contains check instead of delegating to
evaluate_predicate_datalog(). Rules like matches (regex) were silently
ignored, so the condition passed vacuously and let violations through.
It now delegates to evaluate_predicate_datalog(), which handles all 12
rule types correctly.
Tests: 34 new tests covering all new rules, null handling, when conditions,
and the when+matches bugfix (butler rulespec pattern).
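A rough sketch of how the new rules and a when clause could combine. The `Rule` enum, the exact-match semantics of AnyOf/NoneOf, and both function names are assumptions for illustration, not the types in this commit:

```rust
/// Illustrative rule set; the real rulespec has more rule types.
enum Rule {
    Contains(String),
    NotContains(String),
    AnyOf(Vec<String>),
    NoneOf(Vec<String>),
}

/// Assumed semantics: AnyOf/NoneOf compare the whole value against each option.
fn evaluate(rule: &Rule, value: &str) -> bool {
    match rule {
        Rule::Contains(s) => value.contains(s.as_str()),
        Rule::NotContains(s) => !value.contains(s.as_str()),
        Rule::AnyOf(options) => options.iter().any(|o| o.as_str() == value),
        Rule::NoneOf(options) => options.iter().all(|o| o.as_str() != value),
    }
}

/// A guarded rule only applies when the condition holds; otherwise it passes
/// vacuously. A condition that is wrongly evaluated as false therefore hides
/// real violations, which is the failure mode described in the bugfix.
fn evaluate_when(when: &Rule, when_value: &str, rule: &Rule, value: &str) -> bool {
    !evaluate(when, when_value) || evaluate(rule, value)
}
```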
Let approval input flow through the LLM instead of being
short-circuited in the REPL. The LLM calls plan_approve
itself, which is cleaner (single input path) and more
flexible (no hardcoded misspelling list).
The /project command was auto-invoking a status report ("what is the
current state of the project?") as the first user message after loading
project files. This was inconsistent with the --project flag behavior,
which only loads files and displays status without auto-prompting.
Removed the auto-submit lines so /project now behaves identically to
the --project CLI flag: load files, set context, display status, done.
- Enable custom-bindings feature in rustyline
- Bind Alt+Enter to insert newlines in interactive and accumulative modes
- Update calculate_visual_lines() to handle embedded newlines correctly
- Add tests for multiline visual line calculation
Note: Shift+Enter is not distinguishable from Enter in standard terminals,
so Alt+Enter is used as the multiline input trigger.
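Counting rows with embedded newlines can be sketched as below. This is not the actual calculate_visual_lines(); `visual_lines` is a hypothetical stand-in that assumes one column per char and ignores the prompt width and the exact-width cursor wrap case:

```rust
/// Sketch: number of terminal rows `text` occupies when wrapped at `width`.
/// Assumes one column per `char` (no wide or zero-width characters).
fn visual_lines(text: &str, width: usize) -> usize {
    let width = width.max(1); // guard against division by zero
    text.split('\n')
        .map(|line| {
            let cols = line.chars().count();
            // An empty segment still occupies one row.
            if cols == 0 { 1 } else { (cols + width - 1) / width }
        })
        .sum()
}
```

Each newline-separated segment is wrapped independently, so a blank line between two short lines counts as its own row.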
- Change plan mode prompt from ' >> ' to ' [plan mode] >> ' for clarity
- Add magenta syntax highlighting for [plan mode] text in prompt
- Add tests for prompt highlighting behavior
The loaded status line (✓ AGENTS.md ✓ Memory) already indicates that
AGENTS.md was loaded, so the separate '>> AGENTS.md - Machine Instructions'
heading line was redundant.
- Remove print_project_heading() function from display.rs
- Remove extract_project_heading call from interactive.rs
- Clean up unused imports
The shell tool output line was wrapping because update_tool_output_line
clipped the content without reserving space for the suffix that gets
appended later (line count + timing info).
Added suffix_overhead of 30 chars for shell tools to reserve space for:
- " (9999 lines)" = ~13 chars
- " | 99999 ◉ 999ms" = ~17 chars
This ensures the complete line fits within terminal width without wrapping.
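The reservation is simple arithmetic. A sketch using the figures above; `content_budget` is a hypothetical helper name, and the constant is derived from the estimates in this message rather than copied from the code:

```rust
/// From the commit text: ~13 chars for the line count plus ~17 chars for the
/// token/timing suffix.
const SUFFIX_OVERHEAD: usize = 13 + 17;

/// Hypothetical helper: columns left for the content part of a shell tool
/// line once the suffix space is reserved. Never underflows on narrow terminals.
fn content_budget(terminal_width: usize) -> usize {
    terminal_width.saturating_sub(SUFFIX_OVERHEAD)
}
```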
Clip summary text and other long fields to fit terminal width:
- Clip display_summary in print_tool_compact (e.g., "47 lines (2.0k chars)")
- Account for header_suffix length when compressing paths in print_tool_output_header
- Clip TODO item lines in print_todo_compact
- Clip plan item descriptions, evidence, touches, checks, and paths in print_plan_compact
- Replace hardcoded 70/40 char limits with dynamic terminal-width-based clipping
All clipping uses clip_line() which handles UTF-8 safely and adds ellipsis.
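For reference, char-based clipping with an ellipsis can look like the sketch below. `clip_chars` is a hypothetical stand-in, not the real clip_line(), whose signature and ellipsis handling may differ:

```rust
/// Sketch: keep at most `max_chars` characters, replacing the tail with a
/// single-character ellipsis when clipping occurs. Works on chars, so it
/// never slices through a multi-byte UTF-8 sequence.
fn clip_chars(line: &str, max_chars: usize) -> String {
    if line.chars().count() <= max_chars {
        return line.to_string();
    }
    if max_chars == 0 {
        return String::new();
    }
    let mut out: String = line.chars().take(max_chars - 1).collect();
    out.push('…');
    out
}
```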
When a plan reaches a terminal state (all items done or blocked) in
interactive mode, automatically exit plan mode and return to the normal
prompt.
Changes:
- Add Agent::is_plan_terminal() method to check if plan is complete
- Add check_and_exit_plan_mode_if_terminal() helper in interactive.rs
- Call the helper after each execute_user_input() to detect completion
Fixes issue where plan mode prompt ' >> ' persisted after plan completion.
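The terminal-state check can be sketched as follows. `ItemState` and the free function `plan_is_terminal` are illustrative; the real check is the Agent::is_plan_terminal() method, and treating an empty plan as non-terminal is an assumption:

```rust
#[derive(Debug, PartialEq)]
enum ItemState {
    Todo,
    Doing,
    Done,
    Blocked,
}

/// Sketch: a plan is terminal when every item is done or blocked.
/// Assumption: a plan with no items is not considered terminal.
fn plan_is_terminal(states: &[ItemState]) -> bool {
    !states.is_empty()
        && states
            .iter()
            .all(|s| matches!(s, ItemState::Done | ItemState::Blocked))
}
```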
- Add terminal_width module with get_terminal_width(), clip_line(),
compress_path(), and compress_command() utilities
- Update ConsoleUiWriter to use dynamic terminal width for all tool output
- Tool output lines are clipped to fit without wrapping
- Tool headers use semantic compression (paths preserve filename,
commands clip from right)
- 4-character right margin for visual clarity
- Minimum 40 columns, default 80 when terminal size unavailable
- All truncation is UTF-8 safe (char counting, not byte slicing)
- Add 13 unit tests for terminal width utilities
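The "paths preserve filename" idea can be sketched like this. `compress_path_sketch` is hypothetical and simpler than the real compress_path(): it drops all leading directories at once and does not guarantee the result fits in `max_chars`:

```rust
/// Hypothetical path compression: if `path` exceeds `max_chars`, drop the
/// leading directories and keep the final component behind an ellipsis.
/// Paths without a separator are returned unchanged.
fn compress_path_sketch(path: &str, max_chars: usize) -> String {
    if path.chars().count() <= max_chars {
        return path.to_string();
    }
    match path.rsplit_once('/') {
        Some((_, file_name)) => format!("…/{}", file_name),
        None => path.to_string(),
    }
}
```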
- Fix build warnings: add #[allow(dead_code)] to unused deserialization fields
- Fix plan approval gate bug: block file changes when no plan exists (not just
when plan exists but is unapproved)
- Add "Create a plan: " prefix to first user message in plan mode
- Add prepare_plan_mode_input() helper function for testability
- Reset is_first_plan_message flag when entering plan mode via /plan command
- Add tests for approval gate (no plan + no changes, no plan + changes)
- Add tests for prepare_plan_mode_input (happy, negative, boundary cases)
- Update ChecksCompact to use Vec<CheckCompact> for negative/boundary fields
- Add progress bar visualization showing done/doing/blocked/todo counts
- Show evidence for done items, checks for active items
- Display all negative and boundary checks (not just first)
- Add proper tree structure with └/├ prefixes
- Truncate long descriptions and evidence paths
- Add file path display with 📄 icon
When a Rust-only workspace was detected, the Language-Specific Guidance
header was appearing with no content because Rust has an empty prompt
string (agent-specific prompts handle Rust instead).
The fix filters out empty prompt strings in get_language_prompts_for_workspace()
so the header only appears when there's actual guidance content.
Added test to verify Rust-only workspaces return None.
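A sketch of the filtering, assuming prompts arrive as plain strings; `combine_language_prompts` is a hypothetical name and the joining with blank lines is an assumption:

```rust
/// Sketch: join the non-empty language prompts; return None when nothing
/// remains, so the caller can skip the section header entirely.
fn combine_language_prompts(prompts: &[&str]) -> Option<String> {
    let non_empty: Vec<&str> = prompts
        .iter()
        .copied()
        .filter(|p| !p.trim().is_empty())
        .collect();
    if non_empty.is_empty() {
        None
    } else {
        Some(non_empty.join("\n\n"))
    }
}
```

A Rust-only workspace corresponds to the single empty prompt case, which yields None.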
- Web Research instructions now come from skills/research/SKILL.md
- Skills are dynamically loaded and injected via generate_skills_prompt()
- Remove test_both_prompts_have_web_research test (no longer applicable)
- Remove unused G3Status::research_complete() function
This completes the externalization of research as a skill.
Replaces the built-in research/research_status tools with a portable
skill-based approach:
- Add embedded skills infrastructure (skills compiled into binary)
- Add repo-local skills/ directory support (highest priority)
- Create research skill with SKILL.md and g3-research shell script
- Script extraction to .g3/bin/ with version tracking
- Filesystem-based handoff via .g3/research/<id>/status.json
- Remove PendingResearchManager and all research tool code
- Update system prompt to reference skill instead of tool
Benefits:
- No special tool infrastructure needed (just shell + read_file)
- Context-efficient (reports stay on disk until needed)
- Crash-resilient (state persisted to filesystem)
- Portable (skill can be overridden per-workspace)
Breaking change: research tool calls now return a deprecation message
pointing to the research skill.
The --resume flag was being ignored when --agent and --chat flags were
used together. The if-else chain checked for chat mode first and
immediately returned None, skipping the --resume check entirely.
Reordered the logic to check flags.resume first, ensuring explicit
--resume is always honored regardless of other flags.
Fixes: --resume not working with --agent --chat
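The reordering can be illustrated with a reduced flag set. `Flags`, `session_to_resume`, and the `last_session` fallback are assumptions for illustration; only the "check resume first" ordering comes from this commit:

```rust
#[derive(Default)]
struct Flags {
    resume: Option<String>,
    chat: bool,
}

/// Sketch: decide which session (if any) to resume.
fn session_to_resume<'a>(flags: &'a Flags, last_session: Option<&'a str>) -> Option<&'a str> {
    // Explicit --resume is checked first so no other flag can shadow it.
    if let Some(id) = flags.resume.as_deref() {
        return Some(id);
    }
    // Chat mode starts fresh unless --resume was given.
    if flags.chat {
        return None;
    }
    // Assumed fallback for the remaining modes.
    last_session
}
```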
- Remove the interactive prompt that asked users to resume in-progress sessions
- Remove unused new_session parameter from run_interactive()
- Remove unused info_inline() function from G3Status
- Explicit --resume <session_id> flag still works
The CLI starts in plan mode by default (when not in agent mode), but was not
calling agent.set_plan_mode(true) at initialization. As a result, the gate
check did not run until the user explicitly entered plan mode via /plan.
- Add in_plan_mode flag to Agent struct
- Add set_plan_mode() and is_plan_mode() methods
- Gate check now only runs when in_plan_mode is true
- CLI calls set_plan_mode(true) on /plan command and EnterPlanMode
- CLI calls set_plan_mode(false) on approval and CTRL-D exit
- Update integration test to enable plan mode
- Fix test YAML to use Vec<Check> for negative/boundary checks
Implements a pipeline that orchestrates 7 g3 agents in sequence:
1. euler - dependency graph and hotspots analysis
2. breaker - whitebox exploration and edge-case discovery
3. hopper - deep testing and regression integrity
4. fowler - refactoring to deduplicate and reduce complexity
5. carmack - in-place rewriting for readability and concision
6. lamport - human-readable documentation and validation
7. huffman - semantic compression of memory
Features:
- Commit cursor tracking (--from flag to set starting point)
- Crash recovery (resumes from last incomplete stage)
- Git worktree isolation for all pipeline work
- Visual pipeline display with status icons
- Summary generation saved to .g3/sessions/sdlc/
- Pipeline state persisted to analysis/sdlc/pipeline.json
CLI:
- studio sdlc run [-c N] [--from COMMIT]
- studio sdlc status
- studio sdlc reset
Also adds huffman agent to embedded agents list.
- Add --resume CLI flag that conflicts with --new-session
- Add load_continuation_by_id() to load sessions by full or partial ID
- Support loading from latest.json or falling back to session.json
- Handle --resume in both normal and agent modes
- Agent mode validates session belongs to correct agent
Implements the Agent Skills specification (https://agentskills.io) for
portable skill packages that give the agent new capabilities.
Changes:
- Add skills module with SKILL.md parser (YAML frontmatter + markdown body)
- Implement skill discovery from ~/.g3/skills/, config extra_paths, and .g3/skills/
- Generate <available_skills> XML for system prompt injection
- Add SkillsConfig to g3-config with enabled flag and extra_paths
- Wire skills discovery into CLI startup
- Add 29 unit tests for parser, discovery, and prompt generation
- Update README with Agent Skills documentation
Skill locations (lowest to highest priority):
1. ~/.g3/skills/ (global)
2. Config extra_paths
3. .g3/skills/ (workspace, highest priority)
At startup, g3 scans skill directories and injects a summary into the
system prompt. When the agent needs a skill, it reads the full SKILL.md
using the read_file tool.
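One way to picture the override order: process sources from lowest to highest priority and let later entries replace earlier ones. `resolve_skills` and the (name, location) tuple shape are hypothetical, not the discovery API added here:

```rust
use std::collections::HashMap;

/// Hypothetical resolver: each source is a list of (skill name, location)
/// pairs, passed in order from lowest to highest priority.
fn resolve_skills(sources: &[Vec<(&str, &str)>]) -> HashMap<String, String> {
    let mut resolved = HashMap::new();
    for source in sources {
        for (name, location) in source {
            // Later (higher-priority) sources overwrite earlier entries.
            resolved.insert(name.to_string(), location.to_string());
        }
    }
    resolved
}
```

A workspace skill named `research` would shadow a global skill of the same name, while other global skills stay visible.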
- Start g3 in plan mode with ' >>' prompt and welcome message
- Add is_approval_input() to detect 'approve', 'a', 'yes', etc. and misspellings
- Allow trailing punctuation (!, ., ,) on approval words
- Call plan_approve tool directly without LLM when approval detected
- Add synthetic assistant message after approval for LLM context
- Exit plan mode after successful approval, return to 'g3>' prompt
- CTRL-D in plan mode exits plan mode first, then exits g3
- /plan command enters plan mode and shows welcome message
- Agent mode (--agent) does not start in plan mode
- Add CommandResult enum to signal plan mode entry from commands
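A sketch of the approval detection described above. `looks_like_approval` is a hypothetical name; the word list is limited to the examples named in this message (the real list also covers misspellings), and case-insensitive matching is an assumption:

```rust
/// Sketch: recognise approval words, ignoring surrounding whitespace, case,
/// and trailing `!`, `.`, or `,`.
fn looks_like_approval(input: &str) -> bool {
    let normalized = input
        .trim()
        .trim_end_matches(|c: char| c == '!' || c == '.' || c == ',')
        .to_lowercase();
    matches!(normalized.as_str(), "approve" | "a" | "yes")
}
```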
- Fix vertical bar continuation: │ continues all the way down, only the
very last sub-line (boundary of last item) gets └
- Add visual gap before plan file path and change 📄 to ->
- Dedent file path to align with tree root
- Fix plan_approve to use proper compact tool format (was missing from
is_compact_tool matches! in print_tool_compact, causing it to fall
through to regular output with | prefix)
- Update command matching from /feature to /plan in commands.rs
- Update help text, usage message, and example
- Update workspace memory references
- /feature is no longer recognized (completely removed)
Fixes two bugs in the input formatter:
1. Single/double quote regex now requires word boundaries:
- Contractions like it's, don't, won't no longer trigger highlighting
- Only properly quoted text like 'special' or "hello" gets cyan
- Mixed input like "it's a 'test' case" only highlights 'test'
2. Visual line calculation fix for exact terminal width:
- When text exactly fills terminal width, cursor wraps to next line
- Added +1 adjustment to account for this edge case
- Extracted calculate_visual_lines() for testability
Added 9 new tests covering all edge cases.
Plan tools (plan_read, plan_write) now display with elegant tree-style
formatting similar to the old todo_write UI:
- State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked)
- Tree prefixes (├/└) for items with child details
- Strikethrough for completed items
- Shows touches and all three checks (happy/negative/boundary)
- Displays plan file path link at the end
plan_approve uses compact single-line format like read_file:
- Shows approval status and revision number
- Handles already-approved and error cases
Changes:
- Add print_plan_compact() to UiWriter trait with default impl
- Implement print_plan_compact() in ConsoleUiWriter
- Call print_plan_compact() from execute_plan_read/write
- Add plan_read/plan_write to is_self_handled_tool()
- Add plan_approve to is_compact_tool() with format_plan_approve_summary()
- Add serde_yaml dependency to g3-cli
Plan Mode is a cognitive forcing system that requires reasoning about:
- Happy path
- Negative case
- Boundary condition
New tools:
- plan_read: Read current plan for session
- plan_write: Create/update plan with YAML content (validates structure)
- plan_approve: Mark current revision as approved
New command:
- /feature <description>: Start Plan Mode for a new feature
Plan schema requires:
- plan_id, revision, approved_revision
- items with id, description, state, touches, checks (happy/negative/boundary)
- evidence and notes required when marking items done
Verification:
- plan_verify() called automatically when all items are done/blocked
Removed:
- todo_read, todo_write tools
- todo.rs module and related tests
Allow users to view research reports directly from the CLI:
- /research - List all research tasks (unchanged)
- /research <id> - View the full report for a specific research task
- /research latest - View the most recent completed research report
Report display includes query, status, elapsed time, and full content.
When background research completes, g3 now immediately prints a status
message instead of waiting for the next user interaction:
- Added ResearchCompletionNotification and broadcast channel to
PendingResearchManager for push-based notifications
- Added spawn_research_notification_handler() in interactive mode that
listens for completions in a background task
- When idle (at prompt): clears line, prints status, reprints prompt
- When busy (processing): prints status inline (interleaving is fine)
- Added G3Status::research_complete() for consistent formatting
- Added enable_research_notifications() method to Agent
Output format: "g3: 1 research report ... [done]"
Fixes three bugs in the input formatter introduced in 4e16942:
1. Bug 2 & 3 (missing newline, line duplication):
- Changed print! to println! to add trailing newline
- Calculate visual lines based on terminal width instead of
logical line count, fixing duplication for wrapped lines
2. Bug 1 (^M on non-interactive prompts):
- Added TTY check to skip formatting when stdout is not a terminal
- Prevents terminal state corruption for stdin prompts
The research tool now spawns the scout agent in a background tokio task
and returns immediately with a research_id placeholder. This allows the
agent to continue working while research runs (30-120 seconds).
Key changes:
- New PendingResearchManager for tracking async research tasks
- research tool returns immediately with placeholder containing research_id
- research_status tool to check progress of pending research
- Auto-injection of completed research at natural break points:
- Start of each tool iteration (before LLM call)
- Before prompting user in interactive mode
- /research CLI command to list all research tasks
- Updated system prompt to explain async behavior
The agent can:
- Continue with other work while research runs
- Check status with research_status tool
- Yield turn to user if results are critical before continuing
When users type prompts in interactive mode, the input is now
reformatted in place with enhanced highlighting:
- ALL CAPS words (2+ chars) become bold green (e.g., FIX, BUG, HTTP2)
- Quoted text ("..." or '...') becomes cyan
- Standard markdown formatting is also supported
New module: input_formatter.rs with 10 unit tests
Integrated into interactive.rs for both single-line and multiline input
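The ALL CAPS rule can be sketched without a regex. Both function names are hypothetical, and the exact definition (uppercase ASCII letters and digits, at least one letter) is an assumption inferred from the FIX, BUG, HTTP2 examples:

```rust
/// Sketch: 2+ chars, at least one uppercase ASCII letter, and nothing but
/// uppercase ASCII letters and digits.
fn is_all_caps_word(word: &str) -> bool {
    word.chars().count() >= 2
        && word.chars().any(|c| c.is_ascii_uppercase())
        && word.chars().all(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
}

/// Wraps qualifying words in ANSI bold green (SGR 1;32) and resets afterwards.
/// Splits on single spaces only, so words with attached punctuation are skipped.
fn highlight_caps(input: &str) -> String {
    input
        .split(' ')
        .map(|w| {
            if is_all_caps_word(w) {
                format!("\x1b[1;32m{}\x1b[0m", w)
            } else {
                w.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```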
- Add CommonFlags struct to group flags that apply across all modes
- Refactor run_agent_mode() to accept CommonFlags instead of individual params
- Add project loading logic for agent chat mode
- Add integration tests for --project with agent mode
This refactor prevents future bugs where new flags work in one mode
but are forgotten in another.
README.md is no longer auto-loaded into the LLM context at startup.
This saves ~4,600 tokens per session while AGENTS.md and memory.md
still provide all critical information for code tasks.
Changes:
- Delete read_project_readme() function
- Remove readme_content parameter from combine_project_content()
- Rename extract_readme_heading() -> extract_project_heading()
- Rename Agent constructors: *_with_readme_* -> *_with_project_context_*
- Update context preservation to only check for Agent Configuration
- Remove has_readme field from LoadedContent
- Update all tests to use new markers and function names
The LLM can still read README.md on-demand via read_file when needed.
Adds a new --project <PATH> flag that loads project files (brief.md,
contacts.yaml, status.md) at startup, similar to the /project command
but WITHOUT auto-executing the project status prompt.
Changes:
- Add --project flag to cli_args.rs
- Add load_and_validate_project() helper in project.rs (shared by both
--project flag and /project command)
- Modify run_interactive() to accept optional initial_project parameter
- Wire up --project in lib.rs to load project before interactive mode
- Refactor /project command to use shared helper (reduces duplication)
- Add 4 new tests for load_and_validate_project()
- Add GeminiProvider with streaming and native tool calling
- Support gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-pro/flash models
- Model-specific context window detection (1M-2M tokens)
- Message conversion: assistant -> model role mapping
- System messages extracted to system_instruction field
- Tool schema conversion with functionCall/functionResponse parts
- SSE streaming with JSON array buffer parsing
- 8 unit tests for conversion and parsing logic
- Register provider in g3-core and validate in g3-cli
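The role mapping and system message extraction can be sketched as below. `Role` and `convert_messages` are illustrative stand-ins for the provider's own types, and joining multiple system messages with a blank line is an assumption:

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

/// Sketch: split messages into an optional system instruction and a list of
/// (gemini role, text) pairs. Gemini uses "model" where other APIs use
/// "assistant".
fn convert_messages(messages: &[(Role, String)]) -> (Option<String>, Vec<(&'static str, String)>) {
    let mut system_parts = Vec::new();
    let mut contents = Vec::new();
    for (role, text) in messages {
        match role {
            Role::System => system_parts.push(text.clone()),
            Role::User => contents.push(("user", text.clone())),
            Role::Assistant => contents.push(("model", text.clone())),
        }
    }
    let system_instruction = if system_parts.is_empty() {
        None
    } else {
        Some(system_parts.join("\n\n"))
    };
    (system_instruction, contents)
}
```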