alex/g3 - g3 - Millerson GIT hosting

Author	SHA1	Message	Date
Dhanji R. Prasanna	edbae60ff3	Add rulespec extensions: new predicate rules, when conditions, null handling, solon agent Features: - New predicate rules: NotContains, AnyOf, NoneOf - Conditional predicates via when clauses (WhenCondition/CompiledWhenCondition) - Null handling: YAML null treated as absent for exists/not_exists - Solon agent for rulespec authoring (agents/solon.md) - Rulespec schema documentation (prompts/schemas/rulespec.schema.md) Bugfix: - Fixed when condition evaluation in datalog path: catch-all branch did naive string contains instead of delegating to evaluate_predicate_datalog(). Rules like matches (regex) were silently ignored, causing vacuous pass and letting violations through. Now delegates to evaluate_predicate_datalog() which handles all 12 rule types correctly. Tests: 34 new tests covering all new rules, null handling, when conditions, and the when+matches bugfix (butler rulespec pattern).	2026-02-07 16:38:27 +11:00
Dhanji R. Prasanna	328eecfcad	fix: extract_facts fallback for facts-prefixed selectors in datalog verification Root cause: ActionEnvelope.to_yaml_value() creates a Mapping from the facts HashMap without a 'facts:' wrapper key, but rulespec selectors may include a 'facts.' prefix (e.g. 'facts.feature.done' instead of 'feature.done'). This caused zero facts to be extracted, making all predicate evaluations fail. Fix: extract_facts() now tries the selector against the unwrapped envelope value first, and if empty, retries against a facts-wrapped version as fallback. Also: - Strengthened write_envelope tool description to require top-level facts: key, file paths for evidence, and allow free-form notes - Updated system prompt with matching rules - Added 6 new tests (4 unit, 2 integration) - Strengthened existing integration test to verify fact count > 0	2026-02-07 14:42:39 +11:00
Dhanji R. Prasanna	6c8e334793	chore: update workspace memory with datalog program generation notes	2026-02-07 12:41:37 +11:00
Dhanji R. Prasanna	7032e75fc6	Add write_envelope tool with verify_envelope for explicit envelope creation - New crates/g3-core/src/tools/envelope.rs with execute_write_envelope() and verify_envelope() (moved from shadow_datalog_verify in plan.rs) - write_envelope accepts YAML facts, writes envelope.yaml to session dir, then runs datalog verification against analysis/rulespec.yaml in shadow mode - plan_verify() now only checks envelope existence (no longer runs datalog) - Tool count: 13 -> 14 - Updated system prompt to instruct agents to call write_envelope before marking last plan item done - Updated integration tests to use write_envelope tool directly Workflow: write_envelope -> verify_envelope -> datalog shadow artifacts; plan_write(done) -> plan_verify -> checks envelope exists	2026-02-06 16:09:07 +11:00
Dhanji R. Prasanna	f7a240a99b	refactor: decouple rulespec from plan_write, read from analysis/rulespec.yaml - Remove rulespec parameter from plan_write tool definition and execution - Remove rulespec compilation from plan_approve (no longer pre-compiles) - Remove write_rulespec, get_rulespec_path, format_rulespec_yaml/markdown from invariants.rs; read_rulespec() now takes &Path working dir - Remove save/load_compiled_rulespec, get_compiled_rulespec_path from datalog.rs - Update shadow_datalog_verify() to compile on-the-fly from analysis/rulespec.yaml, writing rulespec.compiled.dl and datalog_evaluation.txt to session dir - Remove rulespec display from plan_read output - Remove Invariants/Rulespec section from native.md system prompt - Remove rulespec from prompts.rs plan_write format and examples - Update existing tests to remove rulespec from plan_write calls - Add 3 integration tests for on-the-fly rulespec verification	2026-02-06 15:31:23 +11:00
Dhanji R. Prasanna	abfac197ab	Add datalog-based invariant verification system Implement a new datalog verification layer using datafrog that: - Compiles rulespec to datalog on plan_approve - Extracts facts from action envelope using selectors - Executes datalog rules on plan_verify - Writes evaluation results to datalog_evaluation.txt (shadow mode) Key components: - crates/g3-core/src/tools/datalog.rs: Full datalog module with: - compile_rulespec(): Validates and compiles rulespec - extract_facts(): Extracts facts from envelope YAML - execute_rules(): Runs datafrog iteration - 23 comprehensive tests - crates/g3-core/src/tools/plan.rs: - execute_plan_approve(): Now compiles rulespec on approval - shadow_datalog_verify(): Runs datalog and writes to eval file Results are written to .g3/sessions/<id>/datalog_evaluation.txt for inspection, NOT injected into context window (shadow mode).	2026-02-06 13:50:54 +11:00
Dhanji R. Prasanna	ff15db44c0	Restore research as first-class tool, remove research skill Restores the research tool that was previously externalized as a skill: - Add pending_research.rs: PendingResearchManager with thread-safe task tracking - Add tools/research.rs: execute_research (async), execute_research_status - Add research/research_status tool definitions with exclude_research config - Integrate PendingResearchManager into Agent and ToolContext - Inject completed research results in streaming loop Remove research skill: - Clear EMBEDDED_SKILLS array in embedded.rs - Delete skills/research/ directory - Update all tests expecting embedded research skill - Update docs and memory to reflect the change The research tool now: - Spawns scout agent in background tokio task - Returns immediately with research_id - Automatically injects results into conversation when ready - Supports status checks via research_status tool	2026-02-06 07:38:06 +11:00
Dhanji R. Prasanna	30627bce97	feat(cli): make tool output responsive to terminal width - Add terminal_width module with get_terminal_width(), clip_line(), compress_path(), and compress_command() utilities - Update ConsoleUiWriter to use dynamic terminal width for all tool output - Tool output lines are clipped to fit without wrapping - Tool headers use semantic compression (paths preserve filename, commands clip from right) - 4-character right margin for visual clarity - Minimum 40 columns, default 80 when terminal size unavailable - All truncation is UTF-8 safe (char counting, not byte slicing) - Add 13 unit tests for terminal width utilities	2026-02-05 20:18:30 +11:00
Dhanji R. Prasanna	307f04fa25	chore: Compress workspace memory after research externalization - Remove deleted code: pending_research.rs, tools/research.rs (externalized to skill) - Merge duplicate Agent Skills entries into unified section - Update SDLC state path: analysis/sdlc/ → .g3/sdlc/ - Remove G3Status.resuming() (deleted in `6228001`) - Tighten verbose descriptions throughout Metrics: 444 → 325 lines (-27%), 23.6k → 17.0k chars (-28%) Concepts preserved: all semantic information retained Agent: huffman	2026-02-05 14:29:48 +11:00
Dhanji R. Prasanna	c3549ce043	refactor: Remove unused functions from skills module - Remove is_embedded_skill() from discovery.rs (unused) - Remove get_embedded_skills_map() from embedded.rs (unused) - Remove associated tests for deleted functions - Inline path check in test_repo_overrides_embedded test This eliminates dead code warnings and reduces module surface area without changing any behavior. Agent: fowler	2026-02-05 14:17:56 +11:00
Dhanji R. Prasanna	0e64f13a8a	Merge feature/agent-skills-support: Agent Skills specification support	2026-02-05 12:46:53 +11:00
Dhanji R. Prasanna	add8060526	Add studio sdlc command for SDLC maintenance pipeline Implements a pipeline that orchestrates 7 g3 agents in sequence: 1. euler - dependency graph and hotspots analysis 2. breaker - whitebox exploration and edge-case discovery 3. hopper - deep testing and regression integrity 4. fowler - refactoring to deduplicate and reduce complexity 5. carmack - in-place rewriting for readability and concision 6. lamport - human-readable documentation and validation 7. huffman - semantic compression of memory Features: - Commit cursor tracking (--from flag to set starting point) - Crash recovery (resumes from last incomplete stage) - Git worktree isolation for all pipeline work - Visual pipeline display with status icons - Summary generation saved to .g3/sessions/sdlc/ - Pipeline state persisted to analysis/sdlc/pipeline.json CLI: - studio sdlc run [-c N] [--from COMMIT] - studio sdlc status - studio sdlc reset Also adds huffman agent to embedded agents list.	2026-02-05 10:46:10 +11:00
Dhanji R. Prasanna	3046f0dd6e	feat: Add invariants system for Plan Mode verification Adds rulespec.yaml and envelope.yaml support for machine-readable invariant checking during plan completion. - Add invariants module with Rulespec, ActionEnvelope, and evaluation logic - Add Invariants section to system prompt with workflow instructions - Show rulespec/envelope file status in plan verification output - Rulespec written during planning (captures constraints from task) - Envelope written after implementation (documents what was built)	2026-02-04 20:49:58 +11:00
Dhanji R. Prasanna	a5f6475603	feat: implement Agent Skills specification support Implements the Agent Skills specification (https://agentskills.io) for portable skill packages that give the agent new capabilities. Changes: - Add skills module with SKILL.md parser (YAML frontmatter + markdown body) - Implement skill discovery from ~/.g3/skills/, config extra_paths, and .g3/skills/ - Generate <available_skills> XML for system prompt injection - Add SkillsConfig to g3-config with enabled flag and extra_paths - Wire skills discovery into CLI startup - Add 29 unit tests for parser, discovery, and prompt generation - Update README with Agent Skills documentation Skill locations (priority order): 1. ~/.g3/skills/ (global) 2. Config extra_paths 3. .g3/skills/ (workspace, highest priority) At startup, g3 scans skill directories and injects a summary into the system prompt. When the agent needs a skill, it reads the full SKILL.md using the read_file tool.	2026-02-04 12:58:57 +11:00
Dhanji R. Prasanna	e893794029	Rename /feature command to /plan - Update command matching from /feature to /plan in commands.rs - Update help text, usage message, and example - Update workspace memory references - /feature is no longer recognized (completely removed)	2026-02-02 16:00:09 +11:00
Dhanji R. Prasanna	d6b7177107	Implement plan_verify() for deterministic evidence validation Adds a verification system that checks evidence in completed plan items: - Evidence parsing: supports code locations (file:line, file:line-line, file only) and test references (file::test_name) - Code location verification: checks file exists, validates line numbers in range - Test reference verification: checks test file exists, searches for fn pattern - Verification results: Verified, Warning, Error, Skipped statuses - Loud output formatting with emoji indicators for warnings/errors - Integration with execute_plan_write(): runs when plan is complete and approved - 12 new unit tests covering parsing and verification Warnings are advisory (don't block), errors are loud but also don't block. Blocked items are skipped during verification.	2026-02-02 15:15:03 +11:00
Dhanji R. Prasanna	a63950d8f5	Add Plan Mode to replace TODO system Plan Mode is a cognitive forcing system that requires reasoning about: - Happy path - Negative case - Boundary condition New tools: - plan_read: Read current plan for session - plan_write: Create/update plan with YAML content (validates structure) - plan_approve: Mark current revision as approved New command: - /feature <description>: Start Plan Mode for a new feature Plan schema requires: - plan_id, revision, approved_revision - items with id, description, state, touches, checks (happy/negative/boundary) - evidence and notes required when marking items done Verification: - plan_verify() called automatically when all items are done/blocked Removed: - todo_read, todo_write tools - todo.rs module and related tests	2026-02-02 14:38:25 +11:00
Dhanji R. Prasanna	5ab1598e03	feat: async research tool - runs in background, returns immediately The research tool now spawns the scout agent in a background tokio task and returns immediately with a research_id placeholder. This allows the agent to continue working while research runs (30-120 seconds). Key changes: - New PendingResearchManager for tracking async research tasks - research tool returns immediately with placeholder containing research_id - research_status tool to check progress of pending research - Auto-injection of completed research at natural break points: - Start of each tool iteration (before LLM call) - Before prompting user in interactive mode - /research CLI command to list all research tasks - Updated system prompt to explain async behavior The agent can: - Continue with other work while research runs - Check status with research_status tool - Yield turn to user if results are critical before continuing	2026-01-30 13:00:02 +11:00
Dhanji R. Prasanna	cb3c523edf	Compact workspace memory: -7.5% size, all concepts preserved Transformations applied: - Fixed incorrect line numbers in Streaming Utilities (IterationState 65→166, StreamingState 17→16) - Updated file sizes with verified byte counts (context_window.rs, streaming.rs, compaction.rs, acd.rs) - Tightened verbose descriptions throughout - Removed redundant "Format" column from Chat Template table - Shortened download command (python3 -m huggingface_hub... → huggingface-cli) - Collapsed "Known issues" log-style narrative in Embedded Provider - Removed filler words and redundant explanations Metrics: 224→212 lines (-5%), 12581→11630 chars (-7.5%) All 26 semantic entries preserved. Agent: huffman	2026-01-29 11:38:53 +11:00
Dhanji R. Prasanna	653c5f72ac	Compact workspace memory: 402→224 lines (-44%), 22k→12.6k chars (-43%) Merged duplicate entries: - Context Window & Compaction + Context Compaction → unified section - Streaming Markdown Formatter + Code Blocks → single entry - CLI Argument Parsing + CLI Entry Points + CLI Module Structure → CLI Module Structure - Auto-Memory Feature + Tool Call Tracking + Auto-Memory Reminder Format → Auto-Memory System - Agent Mode folded into CLI Module Structure Tightened verbose sections: - UTF-8 pattern: removed 10-line code example, kept pattern + danger zones - ACD Fragment Storage: replaced 15-line JSON with inline field list - GLM-4 downloads: replaced 12-line bash with table + single download template Entry count: 37 → 26 (-30%) All char ranges, function names, and gotchas preserved. Agent: huffman	2026-01-29 11:34:17 +11:00
Dhanji R. Prasanna	a902be1562	Refactor system prompts to eliminate duplication; upgrade embedded provider - Refactor prompts.rs: extract shared sections (intro, TODO, workspace memory, web research, response guidelines) used by both native and non-native prompts - Fix typo in native prompt: "save them.." -> "save them." - Fix non-native prompt: add missing closing braces in JSON examples, add IMPORTANT steps section, align with native prompt quality - Add 9 unit tests to verify both prompts contain required sections - Upgrade llama-cpp-2 dependency and refactor embedded provider - Update config.example.toml with embedded model examples - Update workspace memory	2026-01-28 09:56:39 +11:00
Dhanji R. Prasanna	5b4079e861	Add prompt cache statistics tracking to /stats command - Extend Usage struct with cache_creation_tokens and cache_read_tokens fields - Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens - Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching - Add CacheStats struct to Agent for cumulative tracking across API calls - Add "Prompt Cache Statistics" section to /stats output showing: - API call count and cache hit count - Hit rate percentage - Total input tokens and cache read/creation tokens - Cache efficiency (% of input served from cache) - Update all provider implementations and test files	2026-01-27 11:32:45 +11:00
Dhanji R. Prasanna	a34a3b08e9	Rename Project Memory to Workspace Memory Rename all references from "Project Memory" to "Workspace Memory" to avoid future conflation if a "project" concept is introduced later. Changes: - Rename read_project_memory() -> read_workspace_memory() - Update all prompts, tool descriptions, and comments - Update header parsing in memory.rs to use "# Workspace Memory" - Update display detection for "=== Workspace Memory ===" - Update documentation and analysis/memory.md 11 files changed, ~36 occurrences updated.	2026-01-21 14:08:42 +05:30
Dhanji R. Prasanna	1f5eff15e5	Updating memory for streaming structs	2026-01-20 15:47:43 +05:30
Dhanji R. Prasanna	168cfff2ed	refactor(g3-core): extract tool output formatting to streaming.rs Centralize tool output formatting logic that was duplicated/scattered in stream_completion_with_tools(). This eliminates code-path aliasing where tool type checks were done in multiple places. Changes: - Add ToolOutputFormat enum (SelfHandled, Compact, Regular) - Add format_tool_result_summary() for centralized formatting decisions - Add is_compact_tool() and is_self_handled_tool() helper functions - Move parse_diff_stats() from lib.rs to streaming.rs - Simplify tool execution display logic in lib.rs using new helpers Net effect: -86 lines in lib.rs, +112 lines in streaming.rs The streaming.rs additions are reusable, well-named functions. All 585+ workspace tests pass. Agent: fowler	2026-01-20 15:45:35 +05:30
Dhanji R. Prasanna	9abb3735d2	refactor(g3-core): use StreamingState and IterationState structs in stream_completion_with_tools Consolidate scattered state variables in the 834-line stream_completion_with_tools() function to use the existing StreamingState and IterationState structs from streaming.rs. This eliminates code-path aliasing where state was tracked in multiple places and makes the streaming loop easier to reason about. Changes: - Add assistant_message_added field to StreamingState - Add stream_stop_reason field to IterationState - Replace 8 inline state variables with StreamingState::new() - Replace 7 iteration-local variables with IterationState::new() - All 585 workspace tests pass This is a pure refactor with no behavior changes. The state structs were already defined in streaming.rs but not used in the main streaming loop. Agent: fowler	2026-01-20 15:05:23 +05:30
Dhanji R. Prasanna	dec22f5e58	refactor(g3-cli): extract commands module and fix test organization - Extract handle_command() from interactive.rs to new commands.rs module (320 lines, 15 match arms for /help, /compact, /thinnify, etc.) - Fix orphaned tests in completion.rs that were outside mod tests block - Add #[allow(dead_code)] to with_include_prompt_filename() (used in tests) - interactive.rs reduced from 595 to 290 lines Agent: fowler	2026-01-20 14:30:50 +05:30
Dhanji R. Prasanna	182f5f98fe	Centralize g3 status message formatting Extract a new g3_status module in g3-cli that provides consistent formatting for all 'g3:' prefixed system status messages. Key changes: - Add G3Status struct with methods for progress, done, failed, error, etc. - Add Status enum with Done, Failed, Error, Resolved, Insufficient, NoChanges - Add ThinResult struct in g3-core for semantic thinning data - Update UiWriter trait with print_thin_result() method - Refactor context thinning to return ThinResult instead of formatted strings - Update all callers to use the new centralized formatting - Session resume/decline messages now use G3Status - Compaction status messages now use G3Status This maintains clean separation of concerns: g3-core emits semantic data, g3-cli handles all terminal formatting and colors.	2026-01-20 09:50:55 +05:30
Dhanji R. Prasanna	5caa101b84	Fix inline JSON being incorrectly detected as tool call The bug was caused by mark_tool_calls_consumed() being called after displaying each chunk, which advanced last_consumed_position to the end of the current buffer. When the next chunk arrived with JSON, the unchecked_buffer started at position 0 of the slice, causing is_on_own_line() to return true (position 0 is always "on its own line"). Removed the problematic mark_tool_calls_consumed() call from the "no tool executed" branch. The remaining call after actual tool execution is correct and necessary. Added integration test that verifies inline JSON in prose is not detected as a tool call.	2026-01-19 14:35:01 +05:30
Dhanji R. Prasanna	85ea8fe69c	Update project memory with agent-specific language prompts Document the new agent+language prompt injection feature including: - AGENT_LANGUAGE_PROMPTS static array location - get_agent_language_prompt() and get_agent_language_prompts_for_workspace_with_langs() - File naming pattern: prompts/langs/<agent>.<lang>.md - Instructions for adding new agent+lang prompts	2026-01-15 06:43:42 +05:30
Dhanji R. Prasanna	afec65fd50	Add language-specific prompt injection for toolchain guidance - Add language_prompts module that auto-detects programming languages in workspace - Scan for language files with depth limit (2) to inject relevant toolchain prompts - Add prompts/langs/ directory for language-specific markdown files - Include Racket/raco toolchain guidance as first language prompt - Update combine_project_content() to accept language_content parameter - Integrate language detection into main CLI flow and agent mode - Update project memory with new feature documentation	2026-01-14 21:00:52 +05:30
Dhanji R. Prasanna	a1dfd9c0b6	Enhanced auto-memory with rich few-shot format - Updated memory reminder prompt with per-symbol char ranges - Added two few-shot examples: Session Continuation (feature) + UTF-8 Safe Slicing (pattern) - Updated system prompt Memory Format section to match - Format: file -> nested symbols with [start..end] ranges and descriptions - Enables direct read_file navigation to specific functions	2026-01-13 21:49:48 +05:30
Dhanji R. Prasanna	9a3b03a41f	Remove flock mode (superseded by studio) Flock mode has been superseded by the studio multi-agent workspace manager. Changes: - Remove g3-ensembles crate entirely - Remove --project, --flock-workspace, --segments, --flock-max-turns CLI flags - Remove run_flock_mode() from autonomous.rs - Remove flock-related tests from cli_integration_test.rs - Update README.md, docs/architecture.md, analysis/memory.md - Delete docs/FLOCK_MODE.md	2026-01-13 15:01:12 +05:30
Dhanji R. Prasanna	81ea149369	Fix confusing documentation references 1. architecture.md: Fixed diagram to show 'studio' instead of 'g3-console' (the crate was renamed during development) 2. analysis/memory.md: Removed reference to non-existent machine_ui_writer.rs 3. theme.rs: Clarified that 'retro' is a theme option (the default theme), not a separate TUI mode. No --retro CLI flag exists.	2026-01-12 20:49:37 +05:30
Dhanji R. Prasanna	43a5d27149	Add compact format for remember, take_screenshot, code_coverage, rehydrate Extend compact single-line output to additional tools: - remember: shows '📝 memory updated (size)' - take_screenshot: shows '📸 path' - code_coverage: shows '📊 report generated' - rehydrate: shows '🔄 restored fragment_id' Tools without file_path argument use simplified format: ● tool_name | summary | tokens ◉ time	2026-01-12 14:45:50 +05:30
Dhanji R. Prasanna	33558bc092	Update project memory with new location documentation	2026-01-12 11:25:59 +05:30
Dhanji R. Prasanna	d508ddd508	Move project memory from .g3/ to analysis/ for version control Project memory is now stored at analysis/memory.md instead of .g3/memory.md. This change enables: - Shared memory across git worktrees (studio agent sessions) - Version-controlled memory that persists across clones - Memory changes tracked in git history and reviewable in PRs Changes: - crates/g3-core/src/tools/memory.rs: Update get_memory_path() to use analysis/ - crates/g3-cli/src/project_files.rs: Update read_project_memory() path - crates/g3-core/src/prompts.rs: Update documentation references (2 occurrences) - analysis/memory.md: Add memory file (copied from .g3/memory.md)	2026-01-12 10:20:33 +05:30

37 Commits
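
Commit 30627bce97 describes clipping tool output to the terminal width with UTF-8 safe truncation (char counting, not byte slicing), a 4-character right margin, a 40-column minimum, and an 80-column default. The sketch below illustrates that idea only. The function bodies, the signatures, the `effective_width` helper, and the way the minimum and margin combine are assumptions for illustration, not the repository's actual code.

```rust
// Hypothetical sketch of UTF-8 safe clipping; not the actual g3 implementation.

/// Clip `line` to at most `max_cols` characters.
/// Iterating over `chars()` never splits a multi-byte code point,
/// whereas a byte slice like `&line[..max_cols]` can panic mid-character.
fn clip_line(line: &str, max_cols: usize) -> String {
    line.chars().take(max_cols).collect()
}

/// Assumed policy: fall back to 80 columns when the terminal size is
/// unavailable, enforce a 40-column minimum, then reserve a 4-column margin.
fn effective_width(detected: Option<usize>) -> usize {
    const MIN_COLS: usize = 40;
    const DEFAULT_COLS: usize = 80;
    const RIGHT_MARGIN: usize = 4;
    detected.unwrap_or(DEFAULT_COLS).max(MIN_COLS) - RIGHT_MARGIN
}
```

Under these assumptions, a caller would compute `effective_width(...)` once per write and pass the result to `clip_line` for each output line. Note that character count is only an approximation of display width for wide glyphs such as emoji.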