alex/g3 - g3 - Millerson GIT hosting

Author	SHA1	Message	Date
Dhanji R. Prasanna	5085f10717	Merge sessions/interactive/07eabd99	2026-02-07 12:29:56 +11:00
Dhanji R. Prasanna	afaee8816c	tweak to system prompt	2026-02-06 20:32:19 +11:00
Dhanji R. Prasanna	14112ff92e	Remove client-side plan approval interception Let approval input flow through the LLM instead of being short-circuited in the REPL. The LLM calls plan_approve itself, which is cleaner (single input path) and more flexible (no hardcoded misspelling list).	2026-02-06 20:16:11 +11:00
Dhanji R. Prasanna	799b4ced8e	Remove auto-submit status prompt from /project command The /project command was auto-invoking a status report ("what is the current state of the project?") as the first user message after loading project files. This was inconsistent with the --project flag behavior, which only loads files and displays status without auto-prompting. Removed the auto-submit lines so /project now behaves identically to the --project CLI flag: load files, set context, display status, done.	2026-02-06 16:12:33 +11:00
Dhanji R. Prasanna	7032e75fc6	Add write_envelope tool with verify_envelope for explicit envelope creation - New crates/g3-core/src/tools/envelope.rs with execute_write_envelope() and verify_envelope() (moved from shadow_datalog_verify in plan.rs) - write_envelope accepts YAML facts, writes envelope.yaml to session dir, then runs datalog verification against analysis/rulespec.yaml in shadow mode - plan_verify() now only checks envelope existence (no longer runs datalog) - Tool count: 13 -> 14 - Updated system prompt to instruct agents to call write_envelope before marking last plan item done - Updated integration tests to use write_envelope tool directly Workflow: write_envelope -> verify_envelope -> datalog shadow artifacts plan_write(done) -> plan_verify -> checks envelope exists	2026-02-06 16:09:07 +11:00
Dhanji R. Prasanna	f7a240a99b	refactor: decouple rulespec from plan_write, read from analysis/rulespec.yaml - Remove rulespec parameter from plan_write tool definition and execution - Remove rulespec compilation from plan_approve (no longer pre-compiles) - Remove write_rulespec, get_rulespec_path, format_rulespec_yaml/markdown from invariants.rs; read_rulespec() now takes &Path working dir - Remove save/load_compiled_rulespec, get_compiled_rulespec_path from datalog.rs - Update shadow_datalog_verify() to compile on-the-fly from analysis/rulespec.yaml, writing rulespec.compiled.dl and datalog_evaluation.txt to session dir - Remove rulespec display from plan_read output - Remove Invariants/Rulespec section from native.md system prompt - Remove rulespec from prompts.rs plan_write format and examples - Update existing tests to remove rulespec from plan_write calls - Add 3 integration tests for on-the-fly rulespec verification	2026-02-06 15:31:23 +11:00
Dhanji R. Prasanna	a93ce932a3	refactor: Clean up Cargo dependencies - remove unused, update outdated - Remove unused const_format from g3-planner (never imported) - Remove unused thiserror from workspace and 5 crates (declared but never used) - Update termimad 0.31 -> 0.34 in studio (consistency with g3-cli) - Update indicatif 0.17 -> 0.18 in g3-cli - Update ratatui 0.29 -> 0.30 in g3-cli - Update walkdir 2.4 -> 2.5 in g3-core - Update image 0.24 -> 0.25 in g3-computer-control (macOS + Linux) - Update config 0.14 -> 0.15 in workspace Blocked: reqwest 0.11 -> 0.12/0.13 requires breaking API changes to bytes_stream() used in 4 providers - needs separate migration effort. All tests pass. No behavior changes. Agent: fowler	2026-02-06 14:22:59 +11:00
Dhanji R. Prasanna	31bdcb651b	feat(cli): add multiline input support with Alt+Enter - Enable custom-bindings feature in rustyline - Bind Alt+Enter to insert newlines in interactive and accumulative modes - Update calculate_visual_lines() to handle embedded newlines correctly - Add tests for multiline visual line calculation Note: Shift+Enter is not distinguishable in standard terminals, so Alt+Enter is used as the multiline input trigger.	2026-02-06 14:09:12 +11:00
Dhanji R. Prasanna	abfac197ab	Add datalog-based invariant verification system Implement a new datalog verification layer using datafrog that: - Compiles rulespec to datalog on plan_approve - Extracts facts from action envelope using selectors - Executes datalog rules on plan_verify - Writes evaluation results to datalog_evaluation.txt (shadow mode) Key components: - crates/g3-core/src/tools/datalog.rs: Full datalog module with: - compile_rulespec(): Validates and compiles rulespec - extract_facts(): Extracts facts from envelope YAML - execute_rules(): Runs datafrog iteration - 23 comprehensive tests - crates/g3-core/src/tools/plan.rs: - execute_plan_approve(): Now compiles rulespec on approval - shadow_datalog_verify(): Runs datalog and writes to eval file Results are written to .g3/sessions/<id>/datalog_evaluation.txt for inspection, NOT injected into context window (shadow mode).	2026-02-06 13:50:54 +11:00
Dhanji R. Prasanna	bcd50190c6	Add explicit [plan mode] indicator to interactive prompt - Change plan mode prompt from ' >> ' to ' [plan mode] >> ' for clarity - Add magenta syntax highlighting for [plan mode] text in prompt - Add tests for prompt highlighting behavior	2026-02-06 11:31:07 +11:00
Dhanji R. Prasanna	f35807b728	refactor: move research tools to loadable toolset Migrate research and research_status tools from core tools to a dynamically loadable toolset, following the same pattern as webdriver. Changes: - Add 'research' toolset to TOOLSET_REGISTRY in toolsets.rs - Add create_research_tools() function with research and research_status - Remove research tools from create_core_tools() in tool_definitions.rs - Remove exclude_research field and with_research_excluded() from ToolConfig - Update tests: core tools now 13 (was 15), added 3 research toolset tests The agent must now call load_toolset('research') to use research tools. This simplifies the default tool set and removes special-case logic for the scout agent (which simply won't load the research toolset).	2026-02-06 11:17:32 +11:00
Dhanji R. Prasanna	cbced3390c	feat: JIT-injectable toolsets with load_toolset tool Implement dynamic tool loading system that allows tools to be loaded on-demand rather than included in the default set. Key changes: - Add toolsets module with registry of loadable toolsets - Add load_toolset tool that returns tool definitions for a named toolset - Add <available_toolsets> section to system prompt - Track loaded toolsets in Agent, extend tool definitions dynamically - Move webdriver (15 tools) to JIT-only loading Benefits: - Leaner default context (fewer tokens consumed) - On-demand loading when agent needs specialized tools - Extensible registry for future toolsets - Idempotent loading with helpful error messages Files: - crates/g3-core/src/toolsets.rs (new) - crates/g3-core/src/tools/toolsets.rs (new) - crates/g3-core/src/tool_definitions.rs - crates/g3-core/src/tool_dispatch.rs - crates/g3-core/src/prompts.rs - crates/g3-core/src/lib.rs - crates/g3-core/src/tools/executor.rs	2026-02-06 09:35:11 +11:00
Dhanji R. Prasanna	ff15db44c0	Restore research as first-class tool, remove research skill Restores the research tool that was previously externalized as a skill: - Add pending_research.rs: PendingResearchManager with thread-safe task tracking - Add tools/research.rs: execute_research (async), execute_research_status - Add research/research_status tool definitions with exclude_research config - Integrate PendingResearchManager into Agent and ToolContext - Inject completed research results in streaming loop Remove research skill: - Clear EMBEDDED_SKILLS array in embedded.rs - Delete skills/research/ directory - Update all tests expecting embedded research skill - Update docs and memory to reflect the change The research tool now: - Spawns scout agent in background tokio task - Returns immediately with research_id - Automatically injects results into conversation when ready - Supports status checks via research_status tool	2026-02-06 07:38:06 +11:00
Dhanji R. Prasanna	b673827076	Fix embedded skill loading: stop XML-escaping location paths The <location> field in the skills XML prompt was being XML-escaped, converting <embedded:research>/SKILL.md to &lt;embedded:research&gt;/SKILL.md. When the LLM tried to use read_file with this escaped path, it would fail. Changes: - Remove escape_xml() call from location field in prompt.rs - Add fallback handling for escaped paths in try_read_embedded_skill() - Add tests for both prompt generation and read_file handling Fixes embedded skill loading for agents like butler running outside the g3 repo.	2026-02-05 23:16:40 +11:00
Dhanji R. Prasanna	65b2ec368f	Add Action Envelope section back to native prompt Restored the Action Envelope instructions with a clear, complete example showing how to write envelope.yaml for rulespec verification.	2026-02-05 22:27:29 +11:00
Dhanji R. Prasanna	3823f8b5f3	Optimize native system prompt - 48% size reduction Removed redundant and vague content from prompts/system/native.md: - Simplified intro from 17 lines to 3 lines - Reduced Code Search section to one line - Removed duplicate Plan Mode example (kept one) - Removed Action Envelope section (rarely used correctly) - Removed verbose Memory Format details (tool description covers it) - Removed Response Guidelines (obvious to modern LLMs) Size: 8,620 chars -> 4,498 chars Also updated: - G3_IDENTITY_LINE constant for agent mode compatibility - Test assertions to check for new prompt markers - System prompt validation to use new marker string	2026-02-05 22:16:34 +11:00
Dhanji R. Prasanna	d978032044	Remove redundant AGENTS.md heading from startup output The loaded status line (✓ AGENTS.md ✓ Memory) already indicates that AGENTS.md was loaded, so the separate '>> AGENTS.md - Machine Instructions' heading line was redundant. - Remove print_project_heading() function from display.rs - Remove extract_project_heading call from interactive.rs - Clean up unused imports	2026-02-05 21:38:47 +11:00
Dhanji R. Prasanna	c6df75d886	Fix shell tool output line clipping to account for suffix The shell tool output line was wrapping because update_tool_output_line clipped the content without reserving space for the suffix that gets appended later (line count + timing info). Added suffix_overhead of 30 chars for shell tools to reserve space for: - " (9999 lines)" = ~13 chars - " | 99999 ◉ 999ms" = ~17 chars This ensures the complete line fits within terminal width without wrapping.	2026-02-05 21:23:00 +11:00
Dhanji R. Prasanna	7e2d9bc22c	Enforce rulespec creation with plan_write for new plans Solves the tautology problem where the LLM would write invariants after implementation, making them match what was done rather than constrain it. Changes: - plan_write now accepts 'rulespec' parameter - New plans REQUIRE rulespec (fails with helpful error if missing) - Plan updates don't require rulespec (backward compatible) - Rulespec is parsed, validated, and written atomically with plan - Updated system prompt with clear examples for new vs update - Updated tool definition schema - Updated all affected tests New flow: task → plan+rulespec → user reviews BOTH → approve → implement	2026-02-05 21:12:02 +11:00
Dhanji R. Prasanna	085688479b	Improve terminal width responsiveness for tool output Clip summary text and other long fields to fit terminal width: - Clip display_summary in print_tool_compact (e.g., "47 lines (2.0k chars)") - Account for header_suffix length when compressing paths in print_tool_output_header - Clip TODO item lines in print_todo_compact - Clip plan item descriptions, evidence, touches, checks, and paths in print_plan_compact - Replace hardcoded 70/40 char limits with dynamic terminal-width-based clipping All clipping uses clip_line() which handles UTF-8 safely and adds ellipsis.	2026-02-05 20:44:12 +11:00
Dhanji R. Prasanna	19162b1fe6	Exit plan mode when plan is completed or blocked When a plan reaches a terminal state (all items done or blocked) in interactive mode, automatically exit plan mode and return to normal prompt. Changes: - Add Agent::is_plan_terminal() method to check if plan is complete - Add check_and_exit_plan_mode_if_terminal() helper in interactive.rs - Call the helper after each execute_user_input() to detect completion Fixes issue where plan mode prompt ' >> ' persisted after plan completion.	2026-02-05 20:31:24 +11:00
Dhanji R. Prasanna	30627bce97	feat(cli): make tool output responsive to terminal width - Add terminal_width module with get_terminal_width(), clip_line(), compress_path(), and compress_command() utilities - Update ConsoleUiWriter to use dynamic terminal width for all tool output - Tool output lines are clipped to fit without wrapping - Tool headers use semantic compression (paths preserve filename, commands clip from right) - 4-character right margin for visual clarity - Minimum 40 columns, default 80 when terminal size unavailable - All truncation is UTF-8 safe (char counting, not byte slicing) - Add 13 unit tests for terminal width utilities	2026-02-05 20:18:30 +11:00
Dhanji R. Prasanna	b2fbcf33d0	Fix plan approval gate and add "Create a plan:" prefix for first message - Fix build warnings: add #[allow(dead_code)] to unused deserialization fields - Fix plan approval gate bug: block file changes when no plan exists (not just when plan exists but is unapproved) - Add "Create a plan: " prefix to first user message in plan mode - Add prepare_plan_mode_input() helper function for testability - Reset is_first_plan_message flag when entering plan mode via /plan command - Add tests for approval gate (no plan + no changes, no plan + changes) - Add tests for prepare_plan_mode_input (happy, negative, boundary cases)	2026-02-05 19:43:38 +11:00
Dhanji R. Prasanna	06d75f613c	feat(plan): display rulespec.yaml and envelope.yaml in plan_read/plan_write output - Add format_envelope_markdown() function in invariants.rs for rich markdown formatting of ActionEnvelope facts - Add format_yaml_value_markdown() helper for recursive YAML value display - Update execute_plan_read() to append rulespec and envelope sections - Update execute_plan_write() to append envelope section alongside rulespec - Add 3 tests for format_envelope_markdown (empty, with facts, null values) When plan_read or plan_write is called, the output now includes: - Plan YAML (as before) - Rulespec section (if rulespec.yaml exists) with invariants grouped by source - Envelope section (if envelope.yaml exists) with facts in readable format Missing files show placeholder text rather than errors.	2026-02-05 19:08:55 +11:00
Dhanji R. Prasanna	bc5c1bdf61	Fix plan UI formatting to handle Vec<Check> and display elegantly - Update ChecksCompact to use Vec<CheckCompact> for negative/boundary fields - Add progress bar visualization showing done/doing/blocked/todo counts - Show evidence for done items, checks for active items - Display all negative and boundary checks (not just first) - Add proper tree structure with └/├ prefixes - Truncate long descriptions and evidence paths - Add file path display with 📄 icon	2026-02-05 14:38:18 +11:00
Dhanji R. Prasanna	e34f37fd47	Merge sessions/sdlc/3b6c6c3e into main Resolved conflicts: - analysis/memory.md: kept condensed documentation from incoming branch - crates/g3-core/src/skills/embedded.rs: removed unused HashMap import, kept better doc comment Additional fix: - crates/g3-core/src/prompts.rs: updated test to match current prompt file content	2026-02-05 14:38:08 +11:00
Dhanji R. Prasanna	307f04fa25	chore: Compress workspace memory after research externalization - Remove deleted code: pending_research.rs, tools/research.rs (externalized to skill) - Merge duplicate Agent Skills entries into unified section - Update SDLC state path: analysis/sdlc/ → .g3/sdlc/ - Remove G3Status.resuming() (deleted in `6228001`) - Tighten verbose descriptions throughout Metrics: 444 → 325 lines (-27%), 23.6k → 17.0k chars (-28%) Concepts preserved: all semantic information retained Agent: huffman	2026-02-05 14:29:48 +11:00
Dhanji R. Prasanna	74c2671e1b	docs: Update documentation for Agent Skills system Document the new Skills system introduced in recent commits: - docs/architecture.md: Add Skills System section with discovery priority, embedded skills, script extraction, and key types - docs/skills.md: New comprehensive guide covering SKILL.md format, discovery priority, embedded skills, research skill usage, and troubleshooting - README.md: Update Agent Skills section with correct priority order, add embedded skills info, research skill usage, and link to Skills Guide in Documentation Map - AGENTS.md: Add skill creation to Adding Features, skill extraction to Dangerous Code Paths, and new Skills System Entry Points section All documentation links validated - no broken links or orphan files. Agent: lamport	2026-02-05 14:26:26 +11:00
Dhanji R. Prasanna	cff32bf0ba	Make research skill self-contained without external scripts - Rewrite SKILL.md with inline instructions to spawn g3 --agent scout directly - Extend read_file to handle embedded skill paths (<embedded:name>/SKILL.md) - Remove scripts field from EmbeddedSkill struct (no longer needed) - Delete extraction.rs module (was only for script extraction) - Delete g3-research bash script - Remove obsolete Async Research Tool section from workspace memory Skills are now fully portable - they work when g3 is installed as a binary without access to source files. Agents can read embedded skill content via read_file with the special <embedded:...> path syntax.	2026-02-05 14:22:17 +11:00
Dhanji R. Prasanna	c3549ce043	refactor: Remove unused functions from skills module - Remove is_embedded_skill() from discovery.rs (unused) - Remove get_embedded_skills_map() from embedded.rs (unused) - Remove associated tests for deleted functions - Inline path check in test_repo_overrides_embedded test This eliminates dead code warnings and reduces module surface area without changing any behavior. Agent: fowler	2026-02-05 14:17:56 +11:00
Dhanji R. Prasanna	38da6a56ef	analysis: Update dependency graph for commits b6d2582..9443f933 Focused analysis on past 10 commits covering: - New skills module in g3-core (parser, discovery, prompt, embedded, extraction) - Research tool externalized to skills/research/ skill - SkillsConfig added to g3-config - SDLC pipeline state moved to .g3/sdlc/ Key findings: - 4 crates changed, 29 files affected (8 added, 2 deleted, 19 modified) - No dependency cycles detected - Clean DAG structure in new skills module - Cross-crate coupling via g3-core::skills and g3-config::SkillsConfig - Compile-time coupling to skills/research/ via include_str! Agent: euler	2026-02-05 14:02:44 +11:00
Dhanji R. Prasanna	788debb93a	remove cruft from system prompt	2026-02-05 14:01:26 +11:00
Dhanji R. Prasanna	68fd7b96c1	Remove accidental Emacs lock file	2026-02-05 14:01:03 +11:00
Dhanji R. Prasanna	6cb70f26fa	Fix empty Language-Specific Guidance header in system prompt When a Rust-only workspace was detected, the Language-Specific Guidance header was appearing with no content because Rust has an empty prompt string (agent-specific prompts handle Rust instead). The fix filters out empty prompt strings in get_language_prompts_for_workspace() so the header only appears when there's actual guidance content. Added test to verify Rust-only workspaces return None.	2026-02-05 14:00:52 +11:00
Dhanji R. Prasanna	9443f9333b	refactor: Remove hardcoded Web Research section from system prompt - Web Research instructions now come from skills/research/SKILL.md - Skills are dynamically loaded and injected via generate_skills_prompt() - Remove test_both_prompts_have_web_research test (no longer applicable) - Remove unused G3Status::research_complete() function This completes the externalization of research as a skill.	2026-02-05 13:41:53 +11:00
Dhanji R. Prasanna	0b308853a0	fix: Improve research skill with ANSI stripping and fallback extraction - Add strip_ansi() function using perl for comprehensive escape sequence removal - Add fallback extraction when scout doesn't output markers - Strip g3 UI elements (session banner, tool output chrome, auto-memory messages) - Reports are now clean plaintext without terminal formatting	2026-02-05 13:35:32 +11:00
Dhanji R. Prasanna	39e586982c	feat: Externalize research tool as embedded skill Replaces the built-in research/research_status tools with a portable skill-based approach: - Add embedded skills infrastructure (skills compiled into binary) - Add repo-local skills/ directory support (highest priority) - Create research skill with SKILL.md and g3-research shell script - Script extraction to .g3/bin/ with version tracking - Filesystem-based handoff via .g3/research/<id>/status.json - Remove PendingResearchManager and all research tool code - Update system prompt to reference skill instead of tool Benefits: - No special tool infrastructure needed (just shell + read_file) - Context-efficient (reports stay on disk until needed) - Crash-resilient (state persisted to filesystem) - Portable (skill can be overridden per-workspace) Breaking change: research tool calls now return a deprecation message pointing to the research skill.	2026-02-05 13:23:26 +11:00
Dhanji R. Prasanna	bf9e3dc878	Merge sessions/interactive/213d9910	2026-02-05 13:05:57 +11:00
Dhanji R. Prasanna	89c071baf6	fix: honor --resume flag when used with --agent --chat The --resume flag was being ignored when --agent and --chat flags were used together. The if-else chain checked for chat mode first and immediately returned None, skipping the --resume check entirely. Reordered the logic to check flags.resume first, ensuring explicit --resume is always honored regardless of other flags. Fixes: --resume not working with --agent --chat	2026-02-05 13:05:48 +11:00
Dhanji R. Prasanna	bc2860dd3a	studio sdlc: merge worktree on completion, move state to .g3/ - Add merge step before worktree cleanup when pipeline completes - On success with commits: merge to main, then cleanup - On failure: preserve worktree for debugging, print path - On merge conflict: preserve worktree, print resolution instructions - Move pipeline.json from analysis/sdlc/ to .g3/sdlc/ (gitignored)	2026-02-05 13:03:54 +11:00
Dhanji R. Prasanna	0e64f13a8a	Merge feature/agent-skills-support: Agent Skills specification support	2026-02-05 12:46:53 +11:00
Dhanji R. Prasanna	6228001bfc	Remove automatic session resume suggestion on startup - Remove the interactive prompt that asked users to resume in-progress sessions - Remove unused new_session parameter from run_interactive() - Remove unused info_inline() function from G3Status - Explicit --resume <session_id> flag still works	2026-02-05 12:40:27 +11:00
Dhanji R. Prasanna	8bbaf6f02e	Tighten system prompt and tool definitions Prompt changes (native.md): - Remove duplicate 'Temporary files' section - Consolidate 'remember' instructions into single authoritative location - Remove motivational 'Benefits' list from Plan Mode - Add 'Code Search Tool Selection' guidance (code_search vs rg) Tool changes (tool_definitions.rs, tool_dispatch.rs): - Remove screenshot tool (webdriver_screenshot remains) - Remove coverage tool - Reduce plan_write description from 22 lines to 1 line - Update tool count tests (16 -> 14 core tools) Net result: ~6 lines removed from prompt, ~56 lines removed from tool definitions, clearer tool selection guidance added.	2026-02-05 12:36:49 +11:00
Dhanji R. Prasanna	b6d25824f3	Tighten system prompt	2026-02-05 12:01:01 +11:00
Dhanji R. Prasanna	25ad198b83	Sync agent plan mode state on CLI startup CLI starts in plan mode by default (when not in agent mode), but was not calling agent.set_plan_mode(true) at initialization. This meant the gate check would not run until the user explicitly entered plan mode via /plan.	2026-02-05 11:47:38 +11:00
Dhanji R. Prasanna	b86901a86b	Merge sessions/interactive/47299e3b	2026-02-05 11:47:24 +11:00
Dhanji R. Prasanna	3d3f68e6da	Externalize native system prompt to markdown file - Move system prompt for native tool calling models to prompts/system/native.md - Use include_str! to embed at compile time - Remove concatenated SHARED_* string constants - Prompt is now readable/editable as a complete markdown document - Non-native prompt still uses Rust constants (acceptable for now)	2026-02-05 11:46:49 +11:00
Dhanji R. Prasanna	0f919237ea	Make plan approval gate only active in plan mode - Add in_plan_mode flag to Agent struct - Add set_plan_mode() and is_plan_mode() methods - Gate check now only runs when in_plan_mode is true - CLI calls set_plan_mode(true) on /plan command and EnterPlanMode - CLI calls set_plan_mode(false) on approval and CTRL-D exit - Update integration test to enable plan mode - Fix test YAML to use Vec<Check> for negative/boundary checks	2026-02-05 11:41:52 +11:00
Dhanji R. Prasanna	3d284b8b60	Merge sessions/interactive/179ac8a6	2026-02-05 11:37:07 +11:00
Dhanji R. Prasanna	1f1a517620	feat(plan): support multiple negative and boundary checks Change Plan Mode to allow multiple negative and boundary checks per item, while keeping happy path as a single check. Schema change: - checks.negative: Check -> Vec<Check> (>=1 required) - checks.boundary: Check -> Vec<Check> (>=1 required) - checks.happy: Check (unchanged, single) This better reflects real-world tasks where there are often multiple error conditions and edge cases worth tracking. Changes: - Update Checks struct to use Vec<Check> for negative/boundary - Update validation to require at least 1 of each - Update prompts and tool definitions with new array syntax - Add 4 new tests for multi-check scenarios	2026-02-05 11:36:45 +11:00

801 Commits
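
Commits 30627bce97 and 085688479b describe clipping tool output to the terminal width with UTF-8-safe truncation (char counting, not byte slicing), a 4-character right margin, a 40-column minimum, and an 80-column default. The following is a minimal sketch of that idea, not the repository's code. The name clip_line comes from the commit messages, but its signature here is a guess; usable_width is a hypothetical helper, and how the margin interacts with the minimum is an assumption.

```rust
// Illustrative sketch only; names, signatures, and constants are assumptions
// based on the commit messages, not code copied from the g3 repository.

/// Smallest width the sketch will lay out for (from "Minimum 40 columns").
const MIN_WIDTH: usize = 40;
/// Width used when the terminal size is unavailable (from "default 80").
const DEFAULT_WIDTH: usize = 80;
/// Columns left empty on the right (from "4-character right margin").
const RIGHT_MARGIN: usize = 4;

/// Hypothetical helper: turn an optionally detected column count into the
/// number of characters available for one output line.
fn usable_width(detected: Option<usize>) -> usize {
    let cols = detected.unwrap_or(DEFAULT_WIDTH).max(MIN_WIDTH);
    cols - RIGHT_MARGIN
}

/// Clip `line` to at most `max_chars` characters, ending with an ellipsis
/// when anything was cut. Works on `char`s, so it never splits a UTF-8
/// code point the way byte slicing could.
fn clip_line(line: &str, max_chars: usize) -> String {
    if line.chars().count() <= max_chars {
        return line.to_string();
    }
    if max_chars == 0 {
        return String::new();
    }
    // Keep room for the ellipsis itself.
    let mut clipped: String = line.chars().take(max_chars - 1).collect();
    clipped.push('…');
    clipped
}
```

Calling clip_line(text, usable_width(None)) would clip to 76 characters under these assumptions. Counting chars keeps the truncation valid UTF-8, but it does not measure display width, so wide glyphs or combining marks could still overflow; handling those would need a width-aware approach.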