alex/g3 - g3 - Millerson GIT hosting

alex/g3

Author	SHA1	Message	Date
Dhanji R. Prasanna	3d3f68e6da	Externalize native system prompt to markdown file - Move system prompt for native tool calling models to prompts/system/native.md - Use include_str! to embed at compile time - Remove concatenated SHARED_* string constants - Prompt is now readable/editable as a complete markdown document - Non-native prompt still uses Rust constants (acceptable for now)	2026-02-05 11:46:49 +11:00
Dhanji R. Prasanna	3d284b8b60	Merge sessions/interactive/179ac8a6	2026-02-05 11:37:07 +11:00
Dhanji R. Prasanna	1f1a517620	feat(plan): support multiple negative and boundary checks Change Plan Mode to allow multiple negative and boundary checks per item, while keeping happy path as a single check. Schema change: - checks.negative: Check -> Vec<Check> (>=1 required) - checks.boundary: Check -> Vec<Check> (>=1 required) - checks.happy: Check (unchanged, single) This better reflects real-world tasks where there are often multiple error conditions and edge cases worth tracking. Changes: - Update Checks struct to use Vec<Check> for negative/boundary - Update validation to require at least 1 of each - Update prompts and tool definitions with new array syntax - Add 4 new tests for multi-check scenarios	2026-02-05 11:36:45 +11:00
Dhanji R. Prasanna	41839b909e	Remove stray test file	2026-02-05 11:34:15 +11:00
Dhanji R. Prasanna	c347a73cbd	Add plan approval gate to block file changes without approved plan - Add check_plan_approval_gate() in tools/plan.rs that runs after each tool call - Detects file changes via git status --porcelain when plan exists but not approved - Reverts changes: git checkout for modified files, rm for new untracked files - Returns blocking message instructing LLM to create/approve plan first - Add ApprovalGateResult enum with Allowed/Blocked/NotGitRepo variants - Add set_session_id() and set_working_dir() methods on Agent for testing - Add integration test using MockProvider to simulate blocked write_file	2026-02-05 11:34:10 +11:00
Dhanji R. Prasanna	add8060526	Add studio sdlc command for SDLC maintenance pipeline Implements a pipeline that orchestrates 7 g3 agents in sequence: 1. euler - dependency graph and hotspots analysis 2. breaker - whitebox exploration and edge-case discovery 3. hopper - deep testing and regression integrity 4. fowler - refactoring to deduplicate and reduce complexity 5. carmack - in-place rewriting for readability and concision 6. lamport - human-readable documentation and validation 7. huffman - semantic compression of memory Features: - Commit cursor tracking (--from flag to set starting point) - Crash recovery (resumes from last incomplete stage) - Git worktree isolation for all pipeline work - Visual pipeline display with status icons - Summary generation saved to .g3/sessions/sdlc/ - Pipeline state persisted to analysis/sdlc/pipeline.json CLI: - studio sdlc run [-c N] [--from COMMIT] - studio sdlc status - studio sdlc reset Also adds huffman agent to embedded agents list.	2026-02-05 10:46:10 +11:00
Dhanji R. Prasanna	fdb1255f02	Add --resume <session-id> flag for explicit session resumption - Add --resume CLI flag that conflicts with --new-session - Add load_continuation_by_id() to load sessions by full or partial ID - Support loading from latest.json or falling back to session.json - Handle --resume in both normal and agent modes - Agent mode validates session belongs to correct agent	2026-02-05 10:23:39 +11:00
Dhanji R. Prasanna	3046f0dd6e	feat: Add invariants system for Plan Mode verification Adds rulespec.yaml and envelope.yaml support for machine-readable invariant checking during plan completion. - Add invariants module with Rulespec, ActionEnvelope, and evaluation logic - Add Invariants section to system prompt with workflow instructions - Show rulespec/envelope file status in plan verification output - Rulespec written during planning (captures constraints from task) - Envelope written after implementation (documents what was built)	2026-02-04 20:49:58 +11:00
Dhanji R. Prasanna	95d9847354	Update dependency analysis artifacts with detailed evidence - hotspots.md: Added specific dependent file lists for each hotspot - hotspots.md: Added cross-crate coupling points table - hotspots.md: Added crate-level coupling scores - limitations.md: Expanded coverage of unobservable patterns - limitations.md: Added confidence levels for inferences - limitations.md: Added extraction method details table Agent: euler	2026-02-02 17:20:15 +11:00
Dhanji R. Prasanna	263a838d31	Remove redundant 'No plan exists' message from plan_read output The UI already shows 'empty' via print_plan_compact, so returning an empty string avoids duplicate output.	2026-02-02 17:19:01 +11:00
Dhanji R. Prasanna	e332109273	Auto-approve plans in non-interactive (autonomous/one-shot) mode - Add auto-approval logic in execute_plan_write() when ctx.is_autonomous is true - Update system prompt to document auto-approval behavior - Plans still require explicit approval in interactive mode	2026-02-02 17:16:21 +11:00
Dhanji R. Prasanna	0aead8d86d	fix: Enable compact UI output for plan_approve tool Added plan_approve to the compact tool list in format_tool_result_summary() so it displays in the same format as other tools like read_file and write_file. The format_plan_approve_summary() function already existed but was never called because plan_approve was missing from the matches! block.	2026-02-02 17:06:10 +11:00
Dhanji R. Prasanna	f8448e5622	feat: Plan Mode interactive flow with approval shortcuts - Start g3 in plan mode with ' >>' prompt and welcome message - Add is_approval_input() to detect 'approve', 'a', 'yes', etc. and misspellings - Allow trailing punctuation (!, ., ,) on approval words - Call plan_approve tool directly without LLM when approval detected - Add synthetic assistant message after approval for LLM context - Exit plan mode after successful approval, return to 'g3>' prompt - CTRL-D in plan mode exits plan mode first, then exits g3 - /plan command enters plan mode and shows welcome message - Agent mode (--agent) does not start in plan mode - Add CommandResult enum to signal plan mode entry from commands	2026-02-02 16:59:52 +11:00
Dhanji R. Prasanna	9024f693fa	Fix plan tool UI formatting - Fix vertical bar continuation: │ continues all the way down, only the very last sub-line (boundary of last item) gets └ - Add visual gap before plan file path and change 📄 to -> - Dedent file path to align with tree root - Fix plan_approve to use proper compact tool format (was missing from is_compact_tool matches! in print_tool_compact, causing it to fall through to regular output with \| prefix)	2026-02-02 16:29:37 +11:00
Dhanji R. Prasanna	e893794029	Rename /feature command to /plan - Update command matching from /feature to /plan in commands.rs - Update help text, usage message, and example - Update workspace memory references - /feature is no longer recognized (completely removed)	2026-02-02 16:00:09 +11:00
Dhanji R. Prasanna	8705228fda	Fix input formatter bugs: apostrophe highlighting and line duplication Fixes two bugs in the input formatter: 1. Single/double quote regex now requires word boundaries: - Contractions like it's, don't, won't no longer trigger highlighting - Only properly quoted text like 'special' or "hello" gets cyan - Mixed input like "it's a 'test' case" only highlights 'test' 2. Visual line calculation fix for exact terminal width: - When text exactly fills terminal width, cursor wraps to next line - Added +1 adjustment to account for this edge case - Extracted calculate_visual_lines() for testability Added 9 new tests covering all edge cases.	2026-02-02 15:54:38 +11:00
Dhanji R. Prasanna	571188305a	feat: add compact UI output for Plan Mode tools Plan tools (plan_read, plan_write) now display with elegant tree-style formatting similar to the old todo_write UI: - State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked) - Tree prefixes (├/└) for items with child details - Strikethrough for completed items - Shows touches and all three checks (happy/negative/boundary) - Displays plan file path link at the end plan_approve uses compact single-line format like read_file: - Shows approval status and revision number - Handles already-approved and error cases Changes: - Add print_plan_compact() to UiWriter trait with default impl - Implement print_plan_compact() in ConsoleUiWriter - Call print_plan_compact() from execute_plan_read/write - Add plan_read/plan_write to is_self_handled_tool() - Add plan_approve to is_compact_tool() with format_plan_approve_summary() - Add serde_yaml dependency to g3-cli	2026-02-02 15:30:05 +11:00
Dhanji R. Prasanna	d6b7177107	Implement plan_verify() for deterministic evidence validation Adds a verification system that checks evidence in completed plan items: - Evidence parsing: supports code locations (file:line, file:line-line, file only) and test references (file::test_name) - Code location verification: checks file exists, validates line numbers in range - Test reference verification: checks test file exists, searches for fn pattern - Verification results: Verified, Warning, Error, Skipped statuses - Loud output formatting with emoji indicators for warnings/errors - Integration with execute_plan_write(): runs when plan is complete and approved - 12 new unit tests covering parsing and verification Warnings are advisory (don't block), errors are loud but also don't block. Blocked items are skipped during verification.	2026-02-02 15:15:03 +11:00
Dhanji R. Prasanna	a63950d8f5	Add Plan Mode to replace TODO system Plan Mode is a cognitive forcing system that requires reasoning about: - Happy path - Negative case - Boundary condition New tools: - plan_read: Read current plan for session - plan_write: Create/update plan with YAML content (validates structure) - plan_approve: Mark current revision as approved New command: - /feature <description>: Start Plan Mode for a new feature Plan schema requires: - plan_id, revision, approved_revision - items with id, description, state, touches, checks (happy/negative/boundary) - evidence and notes required when marking items done Verification: - plan_verify() called automatically when all items are done/blocked Removed: - todo_read, todo_write tools - todo.rs module and related tests	2026-02-02 14:38:25 +11:00
Dhanji R. Prasanna	7fc9eb0778	Fix doc-test failure in GLM adapter Use quadruple backticks for outer code fence to properly escape the nested code fence example showing JSON format.	2026-01-30 14:53:04 +11:00
Dhanji R. Prasanna	afc5bc8574	Readability improvements across streaming_parser, input_formatter, commands - streaming_parser.rs: Reduced ~70 lines by removing redundant comments, consolidating doc comments, using slice syntax for TOOL_CALL_PATTERNS - input_formatter.rs: Lazy regex compilation via once_cell (performance), cleaner function structure, reduced comment noise - commands.rs: Extracted format_research_task_summary() and format_research_report_header() helpers, reduced ~40 lines of duplication - pending_research.rs: Fixed 2 unused variable warnings in tests All changes are behavior-preserving. 446 tests pass. Agent: carmack	2026-01-30 14:48:08 +11:00
Dhanji R. Prasanna	51f12769d5	Merge sessions/hopper/297c7be9	2026-01-30 14:30:53 +11:00
Dhanji R. Prasanna	58bbfde6f4	test: add integration tests for streaming parser stuttering bug fix Add characterization tests for the streaming parser stuttering bug fix (`fa3c920`). These tests verify that when an LLM "stutters" and emits incomplete tool call fragments followed by complete tool calls, the parser: 1. Does not get stuck waiting for the incomplete fragment to complete 2. Successfully parses complete tool calls that appear after the fragment Tests cover: - The exact pattern from butler session butler_c6ab59af2e4f991c - Edge cases that should NOT trigger invalidation (nested JSON, patterns in strings) - Recovery behavior after reset - Multiple complete tool calls - Boundary conditions (chunk boundaries, minimal patterns) Agent: hopper	2026-01-30 14:30:27 +11:00
Dhanji R. Prasanna	3003bdebaa	refactor: fix flaky test and remove dead code in recent commits Fixes issues in the last 11 commits: 1. pending_research.rs: Fix flaky test_generate_id_uniqueness - Replaced random u16 suffix with atomic counter for guaranteed uniqueness - The timestamp+random approach could collide when generating IDs rapidly - Now uses static AtomicU32 counter that increments monotonically 2. embedded/adapters/glm.rs: Remove unused in_code_fence field - Field was written but never read (dead code) - Removed from struct definition, constructor, and reset() 3. embedded/adapters/glm.rs: Fix orphaned tests - Two tests (test_strip_code_fences, test_code_fenced_tool_call) were outside the #[cfg(test)] mod tests block - Moved closing brace to include them in the test module All 446 library tests pass. Agent: fowler	2026-01-30 14:28:43 +11:00
Dhanji R. Prasanna	6bb07ce4f5	Merge sessions/interactive/3c2a09df	2026-01-30 14:20:12 +11:00
Dhanji R. Prasanna	f1a5241777	Add /research <id> and /research latest commands Allow users to view research reports directly from the CLI: - /research - List all research tasks (unchanged) - /research <id> - View the full report for a specific research task - /research latest - View the most recent completed research report Report display includes query, status, elapsed time, and full content.	2026-01-30 14:06:28 +11:00
Dhanji R. Prasanna	fa3c9203e0	Fix streaming parser bug: detect abandoned tool call fragments When the LLM 'stutters' and emits incomplete tool call fragments like: {"tool": "shell", "args": {...}} {"tool": {"tool": "shell", "args": {...}} The parser would get stuck waiting for the incomplete fragment to complete, causing the entire response to be lost (no tool executed, no text displayed). This was observed in butler session butler_c6ab59af2e4f991c where the user's 'send!' command produced no response. Fix: Enhanced is_json_invalidated() to detect when a new tool call pattern ({"tool"}) appears after a newline while parsing an incomplete JSON fragment. This indicates the previous fragment was abandoned and should be invalidated. Safety: - Tool patterns inside JSON strings (e.g., writing example code) are not affected because the check only runs outside strings - Added tests for the stuttering pattern and the file-writing edge case	2026-01-30 14:00:18 +11:00
Dhanji R. Prasanna	f93d05f444	Add real-time research completion notifications When background research completes, g3 now immediately prints a status message instead of waiting for the next user interaction: - Added ResearchCompletionNotification and broadcast channel to PendingResearchManager for push-based notifications - Added spawn_research_notification_handler() in interactive mode that listens for completions in a background task - When idle (at prompt): clears line, prints status, reprints prompt - When busy (processing): prints status inline (interleaving is fine) - Added G3Status::research_complete() for consistent formatting - Added enable_research_notifications() method to Agent Output format: "g3: 1 research report ... [done]"	2026-01-30 13:35:35 +11:00
Dhanji R. Prasanna	5428504777	Fix input formatting bugs: newline, line wrapping, and TTY check Fixes three bugs in the input formatter introduced in `4e16942`: 1. Bug 2 & 3 (missing newline, line duplication): - Changed print! to println! to add trailing newline - Calculate visual lines based on terminal width instead of logical line count, fixing duplication for wrapped lines 2. Bug 1 (^M on non-interactive prompts): - Added TTY check to skip formatting when stdout is not a terminal - Prevents terminal state corruption for stdin prompts	2026-01-30 13:28:31 +11:00
Dhanji R. Prasanna	b252ff443d	Merge sessions/interactive/9681cb67	2026-01-30 13:01:00 +11:00
Dhanji R. Prasanna	5ab1598e03	feat: async research tool - runs in background, returns immediately The research tool now spawns the scout agent in a background tokio task and returns immediately with a research_id placeholder. This allows the agent to continue working while research runs (30-120 seconds). Key changes: - New PendingResearchManager for tracking async research tasks - research tool returns immediately with placeholder containing research_id - research_status tool to check progress of pending research - Auto-injection of completed research at natural break points: - Start of each tool iteration (before LLM call) - Before prompting user in interactive mode - /research CLI command to list all research tasks - Updated system prompt to explain async behavior The agent can: - Continue with other work while research runs - Check status with research_status tool - Yield turn to user if results are critical before continuing	2026-01-30 13:00:02 +11:00
Dhanji R. Prasanna	4e1694248f	Add input formatting for interactive CLI When users type prompts in interactive mode, the input is now reformatted in place with enhanced highlighting: - ALL CAPS words (2+ chars) become bold green (e.g., FIX, BUG, HTTP2) - Quoted text ("..." or ...) becomes cyan - Standard markdown formatting is also supported New module: input_formatter.rs with 10 unit tests Integrated into interactive.rs for both single-line and multiline input	2026-01-30 12:03:36 +11:00
Dhanji R. Prasanna	2e21502357	Fix --project flag not working in agent mode - Add CommonFlags struct to group flags that apply across all modes - Refactor run_agent_mode() to accept CommonFlags instead of individual params - Add project loading logic for agent chat mode - Add integration tests for --project with agent mode This refactor prevents future bugs where new flags work in one mode but are forgotten in another.	2026-01-30 11:28:48 +11:00
Dhanji R. Prasanna	51d22b3282	gemini model perf	2026-01-30 10:09:46 +11:00
Dhanji R. Prasanna	8191a5e8e6	feat(embedded): add GLM tool format adapter for code fence stripping GLM-4 models wrap tool calls in markdown code fences and inline backticks, which prevents the streaming parser from detecting them. This adapter: - Strips ```json and ``` code fence markers during streaming - Strips inline backticks from tool call JSON - Handles chunked streaming correctly (buffers potential fence lines) - Transforms GLM native format (<\|assistant\|>tool_name) to g3 JSON format Also refactors embedded provider into module structure: - embedded/mod.rs - module exports - embedded/provider.rs - main EmbeddedProvider (moved from embedded.rs) - embedded/adapters/mod.rs - ToolFormatAdapter trait - embedded/adapters/glm.rs - GLM-specific adapter Includes 22 unit tests covering edge cases like nested JSON in strings, chunk boundary handling, and false pattern detection. Updates README to show GLM-4 9B now works (⭐⭐) for agentic tasks.	2026-01-29 12:52:09 +11:00
Dhanji R. Prasanna	457ba35f80	docs: Fix documentation accuracy and add missing Gemini provider Corrections made: - docs/architecture.md: Fix crate count from 9 to 8 (actual count) - docs/tools.md: Fix code_search supported languages (kotlin -> haskell, scheme, racket) - docs/CODE_SEARCH.md: Add missing Haskell and Scheme to supported languages list - docs/providers.md: Add complete Gemini provider documentation section - docs/configuration.md: Add Gemini configuration section The Gemini provider (crates/g3-providers/src/gemini.rs) was fully implemented but not documented. The code_search tool actually supports haskell and scheme (via tree-sitter) but documentation incorrectly listed kotlin. Agent: lamport	2026-01-29 12:06:53 +11:00
Dhanji R. Prasanna	f9e0b94cc1	tiny tweak	2026-01-29 12:02:11 +11:00
Dhanji R. Prasanna	853237e62e	Update dependency analysis artifacts Generated comprehensive static dependency analysis for g3 workspace: - graph.json: 108 nodes (9 crates, 99 files), 186 edges - graph.summary.md: Overview with metrics, entrypoints, fan-in/fan-out rankings - sccs.md: No cycles detected (DAG structure confirmed) - layers.observed.md: 4-layer crate hierarchy identified - hotspots.md: ui_writer.rs (15 fan-in), agent_mode.rs (13 fan-out) as key nodes - limitations.md: Documents extraction methodology and caveats Updated AGENTS.md with artifact documentation table. Agent: euler	2026-01-29 11:46:39 +11:00
Dhanji R. Prasanna	cba7d31996	Merge sessions/carmack/ee92b215	2026-01-29 11:40:48 +11:00
Dhanji R. Prasanna	d4941dc95a	refactor(providers): improve readability of embedded.rs and gemini.rs embedded.rs (937→789 lines, -16%): - Extract duplicated inference setup into prepare_context() helper - Extract stop sequence handling into find_stop_sequence() and truncate_at_stop_sequence() - Add InferenceParams struct to consolidate request parameter extraction - Add clear section markers for code organization - Tests now use module-level format functions directly (no duplication) gemini.rs: - Extract common request building into build_request() method - Reduces duplication between complete() and stream() methods All 399 unit tests pass. Behavior unchanged. Agent: carmack	2026-01-29 11:39:46 +11:00
Dhanji R. Prasanna	cb3c523edf	Compact workspace memory: -7.5% size, all concepts preserved Transformations applied: - Fixed incorrect line numbers in Streaming Utilities (IterationState 65→166, StreamingState 17→16) - Updated file sizes with verified byte counts (context_window.rs, streaming.rs, compaction.rs, acd.rs) - Tightened verbose descriptions throughout - Removed redundant "Format" column from Chat Template table - Shortened download command (python3 -m huggingface_hub... → huggingface-cli) - Collapsed "Known issues" log-style narrative in Embedded Provider - Removed filler words and redundant explanations Metrics: 224→212 lines (-5%), 12581→11630 chars (-7.5%) All 26 semantic entries preserved. Agent: huffman	2026-01-29 11:38:53 +11:00
Dhanji R. Prasanna	1bff9b0025	huffman tweak to cover more ground	2026-01-29 11:36:09 +11:00
Dhanji R. Prasanna	653c5f72ac	Compact workspace memory: 402→224 lines (-44%), 22k→12.6k chars (-43%) Merged duplicate entries: - Context Window & Compaction + Context Compaction → unified section - Streaming Markdown Formatter + Code Blocks → single entry - CLI Argument Parsing + CLI Entry Points + CLI Module Structure → CLI Module Structure - Auto-Memory Feature + Tool Call Tracking + Auto-Memory Reminder Format → Auto-Memory System - Agent Mode folded into CLI Module Structure Tightened verbose sections: - UTF-8 pattern: removed 10-line code example, kept pattern + danger zones - ACD Fragment Storage: replaced 15-line JSON with inline field list - GLM-4 downloads: replaced 12-line bash with table + single download template Entry count: 37 → 26 (-30%) All char ranges, function names, and gotchas preserved. Agent: huffman	2026-01-29 11:34:17 +11:00
Dhanji R. Prasanna	bd4473b75f	model performance tweaks to readme	2026-01-29 11:31:29 +11:00
Dhanji R. Prasanna	1bff9d5dcc	tiny tweaks to huffman	2026-01-29 11:31:17 +11:00
Dhanji R. Prasanna	7cf9c3b7bb	Merge sessions/hopper/8e287188	2026-01-29 11:30:54 +11:00
Dhanji R. Prasanna	21f8d5a1aa	Add integration tests for CacheStats and Gemini serialization Agent: hopper Added two new integration test files: 1. cache_stats_integration_test.rs (g3-core) - Tests CacheStats accumulation through streaming completion flow - Verifies cache hit detection (cache_read_tokens > 0) - Tests multi-request accumulation of cache statistics - Verifies cache efficiency and hit rate calculations - Uses MockProvider to simulate provider usage data 2. gemini_serialization_test.rs (g3-providers) - Tests Gemini API message format conversion - Verifies system messages become system_instruction - Verifies assistant role maps to "model" (Gemini terminology) - Tests tool conversion to function_declarations format - Characterizes multi-system-message behavior (last wins) Both test files follow blackbox/integration testing principles: - Test observable behavior through stable surfaces - Do not assert internal implementation details - Include documentation of what is/is not asserted	2026-01-29 11:28:52 +11:00
Dhanji R. Prasanna	570a824780	Rename archivist agent to huffman Named after David Huffman, inventor of Huffman coding - compression that preserves information with fewer bits. Fits the agent's purpose: compact memory, preserve semantics.	2026-01-29 11:22:59 +11:00
Dhanji R. Prasanna	627dd45966	Add archivist to built-in agents list in README	2026-01-29 11:20:23 +11:00
Dhanji R. Prasanna	b45ff37b68	Add archivist agent for memory compaction and signal optimization New agent that maintains workspace memory quality: - Deduplicates entries within memory - Tightens verbose phrasing to terse declarations - Collapses log-style narratives to current-state facts - Removes AGENTS.md ↔ Memory duplication - Ports code locations from AGENTS.md to Memory Goal: increase signal, reduce noise, preserve all semantic information. Agent: archivist	2026-01-29 11:19:47 +11:00

1 2 3 4 5 ...

753 Commits