Commit Graph

338 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
39e586982c feat: Externalize research tool as embedded skill
Replaces the built-in research/research_status tools with a portable
skill-based approach:

- Add embedded skills infrastructure (skills compiled into binary)
- Add repo-local skills/ directory support (highest priority)
- Create research skill with SKILL.md and g3-research shell script
- Script extraction to .g3/bin/ with version tracking
- Filesystem-based handoff via .g3/research/<id>/status.json
- Remove PendingResearchManager and all research tool code
- Update system prompt to reference skill instead of tool

Benefits:
- No special tool infrastructure needed (just shell + read_file)
- Context-efficient (reports stay on disk until needed)
- Crash-resilient (state persisted to filesystem)
- Portable (skill can be overridden per-workspace)

Breaking change: research tool calls now return a deprecation message
pointing to the research skill.
2026-02-05 13:23:26 +11:00
Dhanji R. Prasanna
89c071baf6 fix: honor --resume flag when used with --agent --chat
The --resume flag was being ignored when --agent and --chat flags were
used together. The if-else chain checked for chat mode first and
immediately returned None, skipping the --resume check entirely.

Reordered the logic to check flags.resume first, ensuring explicit
--resume is always honored regardless of other flags.

Fixes: --resume not working with --agent --chat
2026-02-05 13:05:48 +11:00
Dhanji R. Prasanna
0e64f13a8a Merge feature/agent-skills-support: Agent Skills specification support 2026-02-05 12:46:53 +11:00
Dhanji R. Prasanna
6228001bfc Remove automatic session resume suggestion on startup
- Remove the interactive prompt that asked users to resume in-progress sessions
- Remove unused new_session parameter from run_interactive()
- Remove unused info_inline() function from G3Status
- Explicit --resume <session_id> flag still works
2026-02-05 12:40:27 +11:00
Dhanji R. Prasanna
25ad198b83 Sync agent plan mode state on CLI startup
CLI starts in plan mode by default (when not in agent mode), but was not
calling agent.set_plan_mode(true) at initialization. This meant the gate
check would not run until the user explicitly entered plan mode via /plan.
2026-02-05 11:47:38 +11:00
Dhanji R. Prasanna
0f919237ea Make plan approval gate only active in plan mode
- Add in_plan_mode flag to Agent struct
- Add set_plan_mode() and is_plan_mode() methods
- Gate check now only runs when in_plan_mode is true
- CLI calls set_plan_mode(true) on /plan command and EnterPlanMode
- CLI calls set_plan_mode(false) on approval and CTRL-D exit
- Update integration test to enable plan mode
- Fix test YAML to use Vec<Check> for negative/boundary checks
2026-02-05 11:41:52 +11:00
Dhanji R. Prasanna
add8060526 Add studio sdlc command for SDLC maintenance pipeline
Implements a pipeline that orchestrates 7 g3 agents in sequence:
1. euler - dependency graph and hotspots analysis
2. breaker - whitebox exploration and edge-case discovery
3. hopper - deep testing and regression integrity
4. fowler - refactoring to deduplicate and reduce complexity
5. carmack - in-place rewriting for readability and concision
6. lamport - human-readable documentation and validation
7. huffman - semantic compression of memory

Features:
- Commit cursor tracking (--from flag to set starting point)
- Crash recovery (resumes from last incomplete stage)
- Git worktree isolation for all pipeline work
- Visual pipeline display with status icons
- Summary generation saved to .g3/sessions/sdlc/
- Pipeline state persisted to analysis/sdlc/pipeline.json

CLI:
- studio sdlc run [-c N] [--from COMMIT]
- studio sdlc status
- studio sdlc reset

Also adds huffman agent to embedded agents list.
2026-02-05 10:46:10 +11:00
Dhanji R. Prasanna
fdb1255f02 Add --resume <session-id> flag for explicit session resumption
- Add --resume CLI flag that conflicts with --new-session
- Add load_continuation_by_id() to load sessions by full or partial ID
- Support loading from latest.json or falling back to session.json
- Handle --resume in both normal and agent modes
- Agent mode validates session belongs to correct agent
2026-02-05 10:23:39 +11:00
Dhanji R. Prasanna
a5f6475603 feat: implement Agent Skills specification support
Implements the Agent Skills specification (https://agentskills.io) for
portable skill packages that give the agent new capabilities.

Changes:
- Add skills module with SKILL.md parser (YAML frontmatter + markdown body)
- Implement skill discovery from ~/.g3/skills/, config extra_paths, and .g3/skills/
- Generate <available_skills> XML for system prompt injection
- Add SkillsConfig to g3-config with enabled flag and extra_paths
- Wire skills discovery into CLI startup
- Add 29 unit tests for parser, discovery, and prompt generation
- Update README with Agent Skills documentation

Skill locations (priority order):
1. ~/.g3/skills/ (global)
2. Config extra_paths
3. .g3/skills/ (workspace, highest priority)

At startup, g3 scans skill directories and injects a summary into the
system prompt. When the agent needs a skill, it reads the full SKILL.md
using the read_file tool.
2026-02-04 12:58:57 +11:00
Dhanji R. Prasanna
f8448e5622 feat: Plan Mode interactive flow with approval shortcuts
- Start g3 in plan mode with ' >>' prompt and welcome message
- Add is_approval_input() to detect 'approve', 'a', 'yes', etc. and misspellings
- Allow trailing punctuation (!, ., ,) on approval words
- Call plan_approve tool directly without LLM when approval detected
- Add synthetic assistant message after approval for LLM context
- Exit plan mode after successful approval, return to 'g3>' prompt
- CTRL-D in plan mode exits plan mode first, then exits g3
- /plan command enters plan mode and shows welcome message
- Agent mode (--agent) does not start in plan mode
- Add CommandResult enum to signal plan mode entry from commands
2026-02-02 16:59:52 +11:00
Dhanji R. Prasanna
9024f693fa Fix plan tool UI formatting
- Fix vertical bar continuation: │ continues all the way down, only the
  very last sub-line (boundary of last item) gets └
- Add visual gap before plan file path and change 📄 to ->
- Dedent file path to align with tree root
- Fix plan_approve to use proper compact tool format (was missing from
  is_compact_tool matches! in print_tool_compact, causing it to fall
  through to regular output with | prefix)
2026-02-02 16:29:37 +11:00
Dhanji R. Prasanna
e893794029 Rename /feature command to /plan
- Update command matching from /feature to /plan in commands.rs
- Update help text, usage message, and example
- Update workspace memory references
- /feature is no longer recognized (completely removed)
2026-02-02 16:00:09 +11:00
Dhanji R. Prasanna
8705228fda Fix input formatter bugs: apostrophe highlighting and line duplication
Fixes two bugs in the input formatter:

1. Single/double quote regex now requires word boundaries:
   - Contractions like it's, don't, won't no longer trigger highlighting
   - Only properly quoted text like 'special' or "hello" gets cyan
   - Mixed input like "it's a 'test' case" only highlights 'test'

2. Visual line calculation fix for exact terminal width:
   - When text exactly fills terminal width, cursor wraps to next line
   - Added +1 adjustment to account for this edge case
   - Extracted calculate_visual_lines() for testability

Added 9 new tests covering all edge cases.
2026-02-02 15:54:38 +11:00
Dhanji R. Prasanna
571188305a feat: add compact UI output for Plan Mode tools
Plan tools (plan_read, plan_write) now display with elegant tree-style
formatting similar to the old todo_write UI:

- State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked)
- Tree prefixes (├/└) for items with child details
- Strikethrough for completed items
- Shows touches and all three checks (happy/negative/boundary)
- Displays plan file path link at the end

plan_approve uses compact single-line format like read_file:
- Shows approval status and revision number
- Handles already-approved and error cases

Changes:
- Add print_plan_compact() to UiWriter trait with default impl
- Implement print_plan_compact() in ConsoleUiWriter
- Call print_plan_compact() from execute_plan_read/write
- Add plan_read/plan_write to is_self_handled_tool()
- Add plan_approve to is_compact_tool() with format_plan_approve_summary()
- Add serde_yaml dependency to g3-cli
2026-02-02 15:30:05 +11:00
Dhanji R. Prasanna
a63950d8f5 Add Plan Mode to replace TODO system
Plan Mode is a cognitive forcing system that requires reasoning about:
- Happy path
- Negative case
- Boundary condition

New tools:
- plan_read: Read current plan for session
- plan_write: Create/update plan with YAML content (validates structure)
- plan_approve: Mark current revision as approved

New command:
- /feature <description>: Start Plan Mode for a new feature

Plan schema requires:
- plan_id, revision, approved_revision
- items with id, description, state, touches, checks (happy/negative/boundary)
- evidence and notes required when marking items done

Verification:
- plan_verify() called automatically when all items are done/blocked

Removed:
- todo_read, todo_write tools
- todo.rs module and related tests
2026-02-02 14:38:25 +11:00
Dhanji R. Prasanna
afc5bc8574 Readability improvements across streaming_parser, input_formatter, commands
- streaming_parser.rs: Reduced ~70 lines by removing redundant comments,
  consolidating doc comments, using slice syntax for TOOL_CALL_PATTERNS
- input_formatter.rs: Lazy regex compilation via once_cell (performance),
  cleaner function structure, reduced comment noise
- commands.rs: Extracted format_research_task_summary() and
  format_research_report_header() helpers, reduced ~40 lines of duplication
- pending_research.rs: Fixed 2 unused variable warnings in tests

All changes are behavior-preserving. 446 tests pass.

Agent: carmack
2026-01-30 14:48:08 +11:00
Dhanji R. Prasanna
6bb07ce4f5 Merge sessions/interactive/3c2a09df 2026-01-30 14:20:12 +11:00
Dhanji R. Prasanna
f1a5241777 Add /research <id> and /research latest commands
Allow users to view research reports directly from the CLI:

- /research - List all research tasks (unchanged)
- /research <id> - View the full report for a specific research task
- /research latest - View the most recent completed research report

Report display includes query, status, elapsed time, and full content.
2026-01-30 14:06:28 +11:00
Dhanji R. Prasanna
f93d05f444 Add real-time research completion notifications
When background research completes, g3 now immediately prints a status
message instead of waiting for the next user interaction:

- Added ResearchCompletionNotification and broadcast channel to
  PendingResearchManager for push-based notifications
- Added spawn_research_notification_handler() in interactive mode that
  listens for completions in a background task
- When idle (at prompt): clears line, prints status, reprints prompt
- When busy (processing): prints status inline (interleaving is fine)
- Added G3Status::research_complete() for consistent formatting
- Added enable_research_notifications() method to Agent

Output format: "g3: 1 research report ... [done]"
2026-01-30 13:35:35 +11:00
Dhanji R. Prasanna
5428504777 Fix input formatting bugs: newline, line wrapping, and TTY check
Fixes three bugs in the input formatter introduced in 4e16942:

1. Bug 2 & 3 (missing newline, line duplication):
   - Changed print! to println! to add trailing newline
   - Calculate visual lines based on terminal width instead of
     logical line count, fixing duplication for wrapped lines

2. Bug 1 (^M on non-interactive prompts):
   - Added TTY check to skip formatting when stdout is not a terminal
   - Prevents terminal state corruption for stdin prompts
2026-01-30 13:28:31 +11:00
Dhanji R. Prasanna
b252ff443d Merge sessions/interactive/9681cb67 2026-01-30 13:01:00 +11:00
Dhanji R. Prasanna
5ab1598e03 feat: async research tool - runs in background, returns immediately
The research tool now spawns the scout agent in a background tokio task
and returns immediately with a research_id placeholder. This allows the
agent to continue working while research runs (30-120 seconds).

Key changes:
- New PendingResearchManager for tracking async research tasks
- research tool returns immediately with placeholder containing research_id
- research_status tool to check progress of pending research
- Auto-injection of completed research at natural break points:
  - Start of each tool iteration (before LLM call)
  - Before prompting user in interactive mode
- /research CLI command to list all research tasks
- Updated system prompt to explain async behavior

The agent can:
- Continue with other work while research runs
- Check status with research_status tool
- Yield turn to user if results are critical before continuing
2026-01-30 13:00:02 +11:00
Dhanji R. Prasanna
4e1694248f Add input formatting for interactive CLI
When users type prompts in interactive mode, the input is now
reformatted in place with enhanced highlighting:

- ALL CAPS words (2+ chars) become bold green (e.g., FIX, BUG, HTTP2)
- Quoted text ("..." or ...) becomes cyan
- Standard markdown formatting is also supported

New module: input_formatter.rs with 10 unit tests
Integrated into interactive.rs for both single-line and multiline input
2026-01-30 12:03:36 +11:00
Dhanji R. Prasanna
2e21502357 Fix --project flag not working in agent mode
- Add CommonFlags struct to group flags that apply across all modes
- Refactor run_agent_mode() to accept CommonFlags instead of individual params
- Add project loading logic for agent chat mode
- Add integration tests for --project with agent mode

This refactor prevents future bugs where new flags work in one mode
but are forgotten in another.
2026-01-30 11:28:48 +11:00
Dhanji R. Prasanna
5c1e0630b5 Merge sessions/interactive/664ee473 2026-01-29 11:14:28 +11:00
Dhanji R. Prasanna
7bfb9efa19 Remove automatic README loading from context window
README.md is no longer auto-loaded into the LLM context at startup.
This saves ~4,600 tokens per session while AGENTS.md and memory.md
still provide all critical information for code tasks.

Changes:
- Delete read_project_readme() function
- Remove readme_content parameter from combine_project_content()
- Rename extract_readme_heading() -> extract_project_heading()
- Rename Agent constructors: *_with_readme_* -> *_with_project_context_*
- Update context preservation to only check for Agent Configuration
- Remove has_readme field from LoadedContent
- Update all tests to use new markers and function names

The LLM can still read README.md on-demand via read_file when needed.
2026-01-29 11:07:41 +11:00
Dhanji R. Prasanna
5ea43d7b39 Add --project CLI flag for loading projects at startup
Adds a new --project <PATH> flag that loads project files (brief.md,
contacts.yaml, status.md) at startup, similar to the /project command
but WITHOUT auto-executing the project status prompt.

Changes:
- Add --project flag to cli_args.rs
- Add load_and_validate_project() helper in project.rs (shared by both
  --project flag and /project command)
- Modify run_interactive() to accept optional initial_project parameter
- Wire up --project in lib.rs to load project before interactive mode
- Refactor /project command to use shared helper (reduces duplication)
- Add 4 new tests for load_and_validate_project()
2026-01-29 11:06:08 +11:00
Dhanji R. Prasanna
735e9c9312 Add Google Gemini provider support
- Add GeminiProvider with streaming and native tool calling
- Support gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-pro/flash models
- Model-specific context window detection (1M-2M tokens)
- Message conversion: assistant -> model role mapping
- System messages extracted to system_instruction field
- Tool schema conversion with functionCall/functionResponse parts
- SSE streaming with JSON array buffer parsing
- 8 unit tests for conversion and parsing logic
- Register provider in g3-core and validate in g3-cli
2026-01-29 10:11:42 +11:00
Dhanji R. Prasanna
e32c302023 Fix embedded provider initialization and logging
- Use global OnceLock for llama.cpp backend to prevent BackendAlreadyInitialized error
- Suppress verbose llama.cpp stderr logging during model loading
- Fix provider validation to accept "embedded.name" format (extract type before dot)
2026-01-28 10:33:10 +11:00
Dhanji R. Prasanna
755acabd47 Highlight command argument completions in cyan
- /run path completions shown in cyan
- /resume session ID completions shown in cyan
- /project name completions shown in cyan
2026-01-27 12:45:37 +11:00
Dhanji R. Prasanna
8389b0d652 Add TAB autocompletion for /project command
- Complete project names from ~/projects/ directory
- Display shows project name, replacement uses ~/projects/<name> path
- Projects sorted alphabetically
- Added test for project completion
2026-01-27 12:43:24 +11:00
Dhanji R. Prasanna
d6a986ce0f refactor(cli): extract execute_user_input() to eliminate duplication
Both multiline and single-line input paths in interactive.rs had identical
code for:
- Template processing (process_template)
- Task execution (execute_task_with_retry)
- Auto-memory reminder with error handling

Extracted to a single execute_user_input() helper function that handles
all three steps. This eliminates code-path aliasing where the two paths
could drift over time.

File reduced from 401 to 393 lines (-2%).
All 106 g3-cli tests pass.

Agent: fowler
2026-01-26 15:59:55 +11:00
Dhanji R. Prasanna
57f04a77aa Add template expansion to interactive prompts
Apply {{today}} and other template variables to user input in:
- Interactive mode (single and multiline)
- Accumulative mode requirements
2026-01-26 15:43:39 +11:00
Dhanji R. Prasanna
7806897f00 Expand {{today}} to include day of week: YYYY-MM-DD (Monday) 2026-01-26 15:29:47 +11:00
Dhanji R. Prasanna
83f68dae17 style: convert CLI status messages to G3Status format
Convert remaining  emoji status messages in g3-cli to use the
consistent G3Status formatting system:

- accumulative.rs: 'autonomous run ... [done]'
- commands.rs /clear: 'clearing session ... [done]'
- commands.rs /readme: 'reloading README ... [done/failed/error]'
- commands.rs /unproject: 'unloading project ... [done]'

This provides a consistent 'g3: action ... [status]' format across
all CLI status messages.
2026-01-23 10:08:22 +05:30
Dhanji R. Prasanna
155db74aac style: use G3Status formatting for agent mode completion message
Change agent mode completion from ' Agent mode completed' to
'g3: <agent-name> session ... [done]' for consistency with other
g3 status messages.
2026-01-23 10:04:05 +05:30
Dhanji R. Prasanna
dfdc21c3cf Use G3Status formatting for /project loading message
Changed from 'Project loaded: ✓ file1  ✓ file2' to
'g3: loading <project-name> .. ✓ file1  ✓ file2 .. [done]'

- Add G3Status::loading_project() for consistent status formatting
- Update /project command to use new formatting
- Remove unused crossterm imports from commands.rs
2026-01-22 21:03:46 +05:30
Dhanji R. Prasanna
a488a6aa99 feat(cli): colorize project name in prompt via rustyline Highlighter
Implement highlight_prompt() in G3Helper to colorize the project portion
of the prompt in blue. This uses rustyline's proper mechanism for ANSI
codes in prompts, which correctly handles cursor positioning.

Prompt 'butler | finances> ' now shows '| finances>' in blue.
2026-01-22 10:48:17 +05:30
Dhanji R. Prasanna
067c69723b fix(cli): use plain text prompt without ANSI colors
ANSI color codes in rustyline prompts cause various issues:
- \x01...\x02 markers break cursor movement
- Separate prefix printing causes gaps or disappearing text

Simplified to plain text prompt: 'butler | finances> '
This ensures reliable cursor positioning and tab completion.
2026-01-22 10:27:27 +05:30
Dhanji R. Prasanna
cb1f99c41c Revert "fix(cli): use '> ' as readline prompt when project active"
This reverts commit 4d9399f737.
2026-01-22 10:24:21 +05:30
Dhanji R. Prasanna
4d9399f737 fix(cli): use '> ' as readline prompt when project active
Previously used empty string as readline prompt after printing colored
prefix, which caused cursor positioning issues (large gap between
project name and cursor).

Now the prefix contains 'butler | finances' (colored) and readline
gets '> ' as its prompt, so cursor appears immediately after '> '.
2026-01-22 10:18:15 +05:30
Dhanji R. Prasanna
28dd60d4fc fix(cli): separate colored prefix from readline prompt
Rustyline's \x01...\x02 markers for ANSI codes didn't work correctly,
causing cursor positioning issues and breaking line editing.

New approach: build_prompt() returns (prefix, prompt) tuple where:
- prefix: colored text printed before readline (contains ANSI codes)
- prompt: plain text passed to readline (no ANSI codes)

This ensures rustyline correctly calculates line length while still
showing the colored project name.
2026-01-22 09:59:52 +05:30
Dhanji R. Prasanna
be35fa2a7f fix(cli): wrap ANSI codes in prompt for rustyline compatibility
Rustyline needs ANSI escape codes wrapped in \x01...\x02 markers
to correctly calculate visible prompt length. Without this, tab
completion breaks because rustyline miscalculates cursor position.
2026-01-22 08:30:30 +05:30
Dhanji R. Prasanna
3001df3b1a style(cli): simplify project prompt format
Change from: butler |[finances]>
Change to:   butler | finances>
2026-01-22 08:15:18 +05:30
Dhanji R. Prasanna
022f5c70a6 feat(cli): show active project name in interactive prompt
When a project is loaded via /project, the prompt now shows:
  agent_name |[project_name]>

where the |[project_name]> part is displayed in blue.

Examples:
- Default: g3>
- With project: g3 |[myapp]>
- Agent mode: butler>
- Agent + project: butler |[myapp]>

The prompt automatically resets when /unproject is called.

Added build_prompt() function with 7 unit tests covering all prompt states.
2026-01-22 07:24:00 +05:30
Dhanji R. Prasanna
9325a43ff3 feat(cli): shorten file paths in tool output display
Add three-level path shortening hierarchy for cleaner CLI output:
1. Project path -> <project_name>/... (when project loaded via /project)
2. Workspace path -> ./... (relative to current working directory)
3. Home path -> ~/... (fallback for paths under home directory)

Changes:
- Add shorten_path() and shorten_paths_in_command() functions in display.rs
- Add project_path/project_name fields to ConsoleUiWriter
- Add set_workspace_path(), set_project_path(), clear_project() to UiWriter trait
- Add ui_writer() getter to Agent struct
- Wire up project path setting in /project and /unproject commands
- Set workspace path when creating agents in all CLI modes

Before: ● read_file | /Users/dhanji/icloud/butler/projects/appa_estate/status.md
After:  ● read_file | appa_estate/status.md (with project loaded)
        ● read_file | ./src/main.rs (workspace-relative)
        ● read_file | ~/Documents/file.txt (home-relative)
2026-01-21 21:27:16 +05:30
Dhanji R. Prasanna
d7d32db4a4 Fix tab completion in agent+chat mode
Remove duplicate logging initialization in agent_mode.rs. Logging is already
initialized in run() before agent mode is dispatched. The duplicate
tracing_subscriber::fmt::layer() was interfering with rustyline's terminal
state, breaking tab completion.
2026-01-21 15:24:27 +05:30
Dhanji R. Prasanna
581de4845c Add /project and /unproject to tab completion 2026-01-21 14:58:23 +05:30
Dhanji R. Prasanna
feb7c3e40d Add /project and /unproject commands for project-specific context
- Add Project struct in crates/g3-cli/src/project.rs with file loading logic
- Load brief.md, contacts.yaml, status.md from project path
- Load projects.md from workspace root for cross-project context
- Project content appended to system message (survives compaction/dehydration)
- /project <path> loads project and auto-submits prompt asking about state
- /unproject clears project content and resets context
- Add set_project_content(), clear_project_content(), has_project_content() to Agent
- Add new_for_test_with_readme() for testing with custom README content
- Add 6 unit tests for Project struct
- Add 9 integration tests for project context behavior
2026-01-21 14:53:30 +05:30
Dhanji R. Prasanna
a34a3b08e9 Rename Project Memory to Workspace Memory
Rename all references from "Project Memory" to "Workspace Memory" to avoid
future conflation if a "project" concept is introduced later.

Changes:
- Rename read_project_memory() -> read_workspace_memory()
- Update all prompts, tool descriptions, and comments
- Update header parsing in memory.rs to use "# Workspace Memory"
- Update display detection for "=== Workspace Memory ==="
- Update documentation and analysis/memory.md

11 files changed, ~36 occurrences updated.
2026-01-21 14:08:42 +05:30