Commit Graph

725 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
5428504777 Fix input formatting bugs: newline, line wrapping, and TTY check
Fixes three bugs in the input formatter introduced in 4e16942:

1. Bug 2 & 3 (missing newline, line duplication):
   - Changed print! to println! to add trailing newline
   - Calculate visual lines based on terminal width instead of
     logical line count, fixing duplication for wrapped lines

2. Bug 1 (^M on non-interactive prompts):
   - Added TTY check to skip formatting when stdout is not a terminal
   - Prevents terminal state corruption for stdin prompts
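The visual-line calculation described above can be sketched as a pure function (hypothetical names; the actual helpers in input_formatter.rs may be shaped differently):

```rust
/// Number of terminal rows a logical line occupies when wrapped at `width`.
/// An empty line still occupies one row.
fn visual_lines(line: &str, width: usize) -> usize {
    if width == 0 {
        return 1;
    }
    let chars = line.chars().count();
    if chars == 0 { 1 } else { (chars + width - 1) / width }
}

/// Total rows for a multi-line input: the count to clear before reprinting
/// the reformatted prompt, so wrapped lines are not duplicated.
fn total_visual_lines(input: &str, width: usize) -> usize {
    input
        .lines()
        .map(|l| visual_lines(l, width))
        .sum::<usize>()
        .max(1)
}
```

Counting logical lines instead of visual rows under-clears whenever a line wraps, which is the duplication bug described above.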
2026-01-30 13:28:31 +11:00
Dhanji R. Prasanna
b252ff443d Merge sessions/interactive/9681cb67 2026-01-30 13:01:00 +11:00
Dhanji R. Prasanna
5ab1598e03 feat: async research tool - runs in background, returns immediately
The research tool now spawns the scout agent in a background tokio task
and returns immediately with a research_id placeholder. This allows the
agent to continue working while research runs (30-120 seconds).

Key changes:
- New PendingResearchManager for tracking async research tasks
- research tool returns immediately with placeholder containing research_id
- research_status tool to check progress of pending research
- Auto-injection of completed research at natural break points:
  - Start of each tool iteration (before LLM call)
  - Before prompting user in interactive mode
- /research CLI command to list all research tasks
- Updated system prompt to explain async behavior

The agent can:
- Continue with other work while research runs
- Check status with research_status tool
- Yield turn to user if results are critical before continuing
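The spawn-and-poll pattern can be sketched with std threads (the real PendingResearchManager uses tokio tasks; names and signatures here are assumptions):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Debug, PartialEq)]
enum ResearchStatus {
    Pending,
    Complete(String),
}

/// Tracks background research tasks by id; a std-thread sketch of the
/// PendingResearchManager idea.
#[derive(Clone, Default)]
struct PendingResearchManager {
    tasks: Arc<Mutex<HashMap<u64, ResearchStatus>>>,
    next_id: Arc<Mutex<u64>>,
}

impl PendingResearchManager {
    /// Spawn work in the background and return a research_id immediately.
    fn spawn<F>(&self, work: F) -> u64
    where
        F: FnOnce() -> String + Send + 'static,
    {
        let mut id_guard = self.next_id.lock().unwrap();
        *id_guard += 1;
        let id = *id_guard;
        drop(id_guard);
        self.tasks.lock().unwrap().insert(id, ResearchStatus::Pending);
        let tasks = Arc::clone(&self.tasks);
        thread::spawn(move || {
            let result = work();
            tasks.lock().unwrap().insert(id, ResearchStatus::Complete(result));
        });
        id
    }

    /// The research_status check: poll without blocking.
    fn status(&self, id: u64) -> Option<ResearchStatus> {
        self.tasks.lock().unwrap().get(&id).cloned()
    }
}
```

The caller gets the id back right away and injects the completed result later, at a natural break point.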
2026-01-30 13:00:02 +11:00
Dhanji R. Prasanna
4e1694248f Add input formatting for interactive CLI
When users type prompts in interactive mode, the input is now
reformatted in place with enhanced highlighting:

- ALL CAPS words (2+ chars) become bold green (e.g., FIX, BUG, HTTP2)
- Quoted text ("..." or ...) becomes cyan
- Standard markdown formatting is also supported

New module: input_formatter.rs with 10 unit tests
Integrated into interactive.rs for both single-line and multiline input
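The ALL-CAPS rule can be sketched as a word-level pass emitting ANSI escapes (a simplified sketch with a hypothetical name; quoting and markdown handling are omitted):

```rust
/// Wrap ALL-CAPS words of 2+ characters (uppercase letters and digits,
/// e.g. FIX, HTTP2) in ANSI bold green, leaving other words untouched.
fn highlight_caps(input: &str) -> String {
    input
        .split(' ')
        .map(|word| {
            let is_caps = word.len() >= 2
                && word
                    .chars()
                    .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
                && word.chars().any(|c| c.is_ascii_uppercase());
            if is_caps {
                // \x1b[1;32m = bold green, \x1b[0m = reset
                format!("\x1b[1;32m{word}\x1b[0m")
            } else {
                word.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```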
2026-01-30 12:03:36 +11:00
Dhanji R. Prasanna
2e21502357 Fix --project flag not working in agent mode
- Add CommonFlags struct to group flags that apply across all modes
- Refactor run_agent_mode() to accept CommonFlags instead of individual params
- Add project loading logic for agent chat mode
- Add integration tests for --project with agent mode

This refactor prevents future bugs where new flags work in one mode
but are forgotten in another.
2026-01-30 11:28:48 +11:00
Dhanji R. Prasanna
51d22b3282 gemini model perf 2026-01-30 10:09:46 +11:00
Dhanji R. Prasanna
8191a5e8e6 feat(embedded): add GLM tool format adapter for code fence stripping
GLM-4 models wrap tool calls in markdown code fences and inline backticks,
which prevents the streaming parser from detecting them. This adapter:

- Strips ```json and ``` code fence markers during streaming
- Strips inline backticks from tool call JSON
- Handles chunked streaming correctly (buffers potential fence lines)
- Transforms GLM native format (<|assistant|>tool_name) to g3 JSON format

Also refactors embedded provider into module structure:
- embedded/mod.rs - module exports
- embedded/provider.rs - main EmbeddedProvider (moved from embedded.rs)
- embedded/adapters/mod.rs - ToolFormatAdapter trait
- embedded/adapters/glm.rs - GLM-specific adapter

Includes 22 unit tests covering edge cases like nested JSON in strings,
chunk boundary handling, and false pattern detection.

Updates README to show GLM-4 9B now works for agentic tasks.
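The fence stripping can be sketched as a line-based filter (a simplification; the real adapter works on streaming chunks, buffers potential fence lines across chunk boundaries, and also strips inline backticks):

```rust
/// Strip markdown code fence lines (a "```json" opener or bare "```"
/// closer) from model output so the tool-call parser sees raw JSON.
fn strip_code_fences(text: &str) -> String {
    // Build the fence marker at runtime to avoid a literal fence here.
    let fence: String = "`".repeat(3);
    text.lines()
        .filter(|line| !line.trim_start().starts_with(&fence))
        .collect::<Vec<_>>()
        .join("\n")
}
```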
2026-01-29 12:52:09 +11:00
Dhanji R. Prasanna
457ba35f80 docs: Fix documentation accuracy and add missing Gemini provider
Corrections made:
- docs/architecture.md: Fix crate count from 9 to 8 (actual count)
- docs/tools.md: Fix code_search supported languages (kotlin -> haskell, scheme, racket)
- docs/CODE_SEARCH.md: Add missing Haskell and Scheme to supported languages list
- docs/providers.md: Add complete Gemini provider documentation section
- docs/configuration.md: Add Gemini configuration section

The Gemini provider (crates/g3-providers/src/gemini.rs) was fully implemented
but not documented. The code_search tool actually supports haskell and scheme
(via tree-sitter) but documentation incorrectly listed kotlin.

Agent: lamport
2026-01-29 12:06:53 +11:00
Dhanji R. Prasanna
f9e0b94cc1 tiny tweak 2026-01-29 12:02:11 +11:00
Dhanji R. Prasanna
853237e62e Update dependency analysis artifacts
Generated comprehensive static dependency analysis for g3 workspace:

- graph.json: 108 nodes (9 crates, 99 files), 186 edges
- graph.summary.md: Overview with metrics, entrypoints, fan-in/fan-out rankings
- sccs.md: No cycles detected (DAG structure confirmed)
- layers.observed.md: 4-layer crate hierarchy identified
- hotspots.md: ui_writer.rs (15 fan-in), agent_mode.rs (13 fan-out) as key nodes
- limitations.md: Documents extraction methodology and caveats

Updated AGENTS.md with artifact documentation table.

Agent: euler
2026-01-29 11:46:39 +11:00
Dhanji R. Prasanna
cba7d31996 Merge sessions/carmack/ee92b215 2026-01-29 11:40:48 +11:00
Dhanji R. Prasanna
d4941dc95a refactor(providers): improve readability of embedded.rs and gemini.rs
embedded.rs (937→789 lines, -16%):
- Extract duplicated inference setup into prepare_context() helper
- Extract stop sequence handling into find_stop_sequence() and truncate_at_stop_sequence()
- Add InferenceParams struct to consolidate request parameter extraction
- Add clear section markers for code organization
- Tests now use module-level format functions directly (no duplication)

gemini.rs:
- Extract common request building into build_request() method
- Reduces duplication between complete() and stream() methods

All 399 unit tests pass. Behavior unchanged.
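The stop-sequence helpers named above might look like this (signatures are assumptions; only the function names come from the commit):

```rust
/// Byte offset of the earliest occurrence of any stop sequence in `text`.
fn find_stop_sequence(text: &str, stops: &[&str]) -> Option<usize> {
    stops.iter().filter_map(|s| text.find(s)).min()
}

/// Truncate generated text at the first stop sequence, if any.
fn truncate_at_stop_sequence(text: &str, stops: &[&str]) -> String {
    match find_stop_sequence(text, stops) {
        Some(idx) => text[..idx].to_string(),
        None => text.to_string(),
    }
}
```

Taking the minimum over all matches ensures the earliest stop wins when several sequences appear.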

Agent: carmack
2026-01-29 11:39:46 +11:00
Dhanji R. Prasanna
cb3c523edf Compact workspace memory: -7.5% size, all concepts preserved
Transformations applied:
- Fixed incorrect line numbers in Streaming Utilities (IterationState 65→166, StreamingState 17→16)
- Updated file sizes with verified byte counts (context_window.rs, streaming.rs, compaction.rs, acd.rs)
- Tightened verbose descriptions throughout
- Removed redundant "Format" column from Chat Template table
- Shortened download command (python3 -m huggingface_hub... → huggingface-cli)
- Collapsed "Known issues" log-style narrative in Embedded Provider
- Removed filler words and redundant explanations

Metrics: 224→212 lines (-5%), 12581→11630 chars (-7.5%)
All 26 semantic entries preserved.

Agent: huffman
2026-01-29 11:38:53 +11:00
Dhanji R. Prasanna
1bff9b0025 huffman tweak to cover more ground 2026-01-29 11:36:09 +11:00
Dhanji R. Prasanna
653c5f72ac Compact workspace memory: 402→224 lines (-44%), 22k→12.6k chars (-43%)
Merged duplicate entries:
- Context Window & Compaction + Context Compaction → unified section
- Streaming Markdown Formatter + Code Blocks → single entry
- CLI Argument Parsing + CLI Entry Points + CLI Module Structure → CLI Module Structure
- Auto-Memory Feature + Tool Call Tracking + Auto-Memory Reminder Format → Auto-Memory System
- Agent Mode folded into CLI Module Structure

Tightened verbose sections:
- UTF-8 pattern: removed 10-line code example, kept pattern + danger zones
- ACD Fragment Storage: replaced 15-line JSON with inline field list
- GLM-4 downloads: replaced 12-line bash with table + single download template

Entry count: 37 → 26 (-30%)
All char ranges, function names, and gotchas preserved.

Agent: huffman
2026-01-29 11:34:17 +11:00
Dhanji R. Prasanna
bd4473b75f model performance tweaks to readme 2026-01-29 11:31:29 +11:00
Dhanji R. Prasanna
1bff9d5dcc tiny tweaks to huffman 2026-01-29 11:31:17 +11:00
Dhanji R. Prasanna
7cf9c3b7bb Merge sessions/hopper/8e287188 2026-01-29 11:30:54 +11:00
Dhanji R. Prasanna
21f8d5a1aa Add integration tests for CacheStats and Gemini serialization
Agent: hopper

Added two new integration test files:

1. cache_stats_integration_test.rs (g3-core)
   - Tests CacheStats accumulation through streaming completion flow
   - Verifies cache hit detection (cache_read_tokens > 0)
   - Tests multi-request accumulation of cache statistics
   - Verifies cache efficiency and hit rate calculations
   - Uses MockProvider to simulate provider usage data

2. gemini_serialization_test.rs (g3-providers)
   - Tests Gemini API message format conversion
   - Verifies system messages become system_instruction
   - Verifies assistant role maps to "model" (Gemini terminology)
   - Tests tool conversion to function_declarations format
   - Characterizes multi-system-message behavior (last wins)

Both test files follow blackbox/integration testing principles:
- Test observable behavior through stable surfaces
- Do not assert internal implementation details
- Include documentation of what is/is not asserted
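The role conversion those tests characterize can be sketched as (types and field layout are assumptions; the behaviors — "model" role, system_instruction extraction, last system message wins — are from the tests above):

```rust
enum Role {
    System,
    User,
    Assistant,
}

struct GeminiRequest {
    system_instruction: Option<String>,
    contents: Vec<(String, String)>, // (gemini role, text)
}

/// Map internal chat roles to Gemini's format: Gemini has no "assistant"
/// role (it uses "model"), and system messages are lifted out into the
/// request-level system_instruction field.
fn to_gemini(messages: &[(Role, &str)]) -> GeminiRequest {
    let mut system_instruction = None;
    let mut contents = Vec::new();
    for (role, text) in messages {
        match role {
            // Last system message wins, matching the characterized behavior.
            Role::System => system_instruction = Some(text.to_string()),
            Role::User => contents.push(("user".to_string(), text.to_string())),
            Role::Assistant => contents.push(("model".to_string(), text.to_string())),
        }
    }
    GeminiRequest { system_instruction, contents }
}
```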
2026-01-29 11:28:52 +11:00
Dhanji R. Prasanna
570a824780 Rename archivist agent to huffman
Named after David Huffman, inventor of Huffman coding -
compression that preserves information with fewer bits.

Fits the agent's purpose: compact memory, preserve semantics.
2026-01-29 11:22:59 +11:00
Dhanji R. Prasanna
627dd45966 Add archivist to built-in agents list in README 2026-01-29 11:20:23 +11:00
Dhanji R. Prasanna
b45ff37b68 Add archivist agent for memory compaction and signal optimization
New agent that maintains workspace memory quality:
- Deduplicates entries within memory
- Tightens verbose phrasing to terse declarations
- Collapses log-style narratives to current-state facts
- Removes AGENTS.md ↔ Memory duplication
- Ports code locations from AGENTS.md to Memory

Goal: increase signal, reduce noise, preserve all semantic information.

Agent: archivist
2026-01-29 11:19:47 +11:00
Dhanji R. Prasanna
56f558dc1b Fix compiler warnings in test files
Eliminate unused variable and import warnings across test files:
- streaming_parser_test.rs: prefix unused `tools` with underscore
- webdriver_session.rs: remove unused `use super::*` import
- mock_provider_integration_test.rs: prefix unused `result` and `task_result`
- test_preflight_max_tokens.rs: prefix unused `proposed_max`
- todo_staleness_test.rs: add #[allow(dead_code)] for test helper methods
- json_parsing_stress_test.rs: prefix unused `tools`
- read_file_token_limit_test.rs: add #[allow(dead_code)] for unused helper
- background_process_demo_test.rs: remove unused PathBuf import
- test_session_continuation.rs: prefix unused `temp_dir` in 7 tests

All tests pass. No behavior changes.

Agent: fowler
2026-01-29 11:15:10 +11:00
Dhanji R. Prasanna
5c1e0630b5 Merge sessions/interactive/664ee473 2026-01-29 11:14:28 +11:00
Dhanji R. Prasanna
9a998e201a Tighten AGENTS.md: remove redundant content covered by Memory
Removed sections that duplicate Workspace Memory:
- Recommended Entry Points (Memory has precise file/line locations)
- For Debugging paths (Memory has session/error log details)
- Dependency Analysis Artifacts (reference info, not actionable)

Kept essential guardrails:
- Critical Invariants (MUST/MUST NOT rules)
- Dangerous Code Paths (risk warnings, not locations)
- Do/Don't coding standards
- Common Incorrect Assumptions

Reduction: 125 lines → 69 lines (~45% smaller, ~650 tokens saved)
2026-01-29 11:13:25 +11:00
Dhanji R. Prasanna
7bfb9efa19 Remove automatic README loading from context window
README.md is no longer auto-loaded into the LLM context at startup.
This saves ~4,600 tokens per session while AGENTS.md and memory.md
still provide all critical information for code tasks.

Changes:
- Delete read_project_readme() function
- Remove readme_content parameter from combine_project_content()
- Rename extract_readme_heading() -> extract_project_heading()
- Rename Agent constructors: *_with_readme_* -> *_with_project_context_*
- Update context preservation to only check for Agent Configuration
- Remove has_readme field from LoadedContent
- Update all tests to use new markers and function names

The LLM can still read README.md on-demand via read_file when needed.
2026-01-29 11:07:41 +11:00
Dhanji R. Prasanna
5ea43d7b39 Add --project CLI flag for loading projects at startup
Adds a new --project <PATH> flag that loads project files (brief.md,
contacts.yaml, status.md) at startup, similar to the /project command
but WITHOUT auto-executing the project status prompt.

Changes:
- Add --project flag to cli_args.rs
- Add load_and_validate_project() helper in project.rs (shared by both
  --project flag and /project command)
- Modify run_interactive() to accept optional initial_project parameter
- Wire up --project in lib.rs to load project before interactive mode
- Refactor /project command to use shared helper (reduces duplication)
- Add 4 new tests for load_and_validate_project()
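A minimal sketch of what a shared load_and_validate_project() helper might check (only the function name is from the commit; the validation rules and signature here are assumptions, with brief.md treated as the required file):

```rust
use std::path::Path;

/// Validate a project directory before loading it: the path must exist,
/// be a directory, and contain a brief.md. Returns the brief contents.
fn load_and_validate_project(dir: &Path) -> Result<String, String> {
    if !dir.is_dir() {
        return Err(format!("not a directory: {}", dir.display()));
    }
    let brief = dir.join("brief.md");
    if !brief.is_file() {
        return Err("missing brief.md".to_string());
    }
    std::fs::read_to_string(&brief).map_err(|e| e.to_string())
}
```

Sharing one validator between the flag and the /project command is what keeps the two paths from drifting.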
2026-01-29 11:06:08 +11:00
Dhanji R. Prasanna
05d253ee2a docs: add embedded model performance comparison for agentic tasks
Added a new section documenting local LLM performance on complex agentic
tasks (comic book repacking test case). Includes:

- Cloud model baseline (Claude Opus 4.5, Sonnet 4.5, Claude 4 family)
- Local model ratings (Qwen3-32B, Qwen3-14B, GLM-4 9B, Qwen3-4B)
- Key findings about MoE vs dense models
- Configuration example for embedded providers
2026-01-29 10:33:53 +11:00
Dhanji R. Prasanna
f6717b4435 Add Gemini 3 model context window detection 2026-01-29 10:20:56 +11:00
Dhanji R. Prasanna
735e9c9312 Add Google Gemini provider support
- Add GeminiProvider with streaming and native tool calling
- Support gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-pro/flash models
- Model-specific context window detection (1M-2M tokens)
- Message conversion: assistant -> model role mapping
- System messages extracted to system_instruction field
- Tool schema conversion with functionCall/functionResponse parts
- SSE streaming with JSON array buffer parsing
- 8 unit tests for conversion and parsing logic
- Register provider in g3-core and validate in g3-cli
2026-01-29 10:11:42 +11:00
Dhanji R. Prasanna
fe33568ee0 Fix embedded provider max_tokens default (2048 -> 8192)
The resolve_max_tokens() function was returning 2048 for embedded providers,
which caused responses to be truncated prematurely. Increased to 8192 to
allow the provider's own effective_max_tokens() calculation to work properly.
2026-01-28 13:58:14 +11:00
Dhanji R. Prasanna
58fe74334d Auto-detect context window size from GGUF for embedded providers
- Add context_window_size() method to LLMProvider trait
- Implement for EmbeddedProvider to return the auto-detected context length
- Update Agent to query provider directly instead of using hardcoded defaults
- Removes need for model-specific context length mappings
2026-01-28 11:16:14 +11:00
Dhanji R. Prasanna
55dba121b7 Add GLM-4 to context length defaults (32k)
GLM-4 models support 32k context but were falling back to the
conservative 4096 default, causing context overflow on startup.
2026-01-28 10:46:36 +11:00
Dhanji R. Prasanna
e32c302023 Fix embedded provider initialization and logging
- Use global OnceLock for llama.cpp backend to prevent BackendAlreadyInitialized error
- Suppress verbose llama.cpp stderr logging during model loading
- Fix provider validation to accept "embedded.name" format (extract type before dot)
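The global OnceLock pattern can be sketched with a counter standing in for the llama.cpp backend handle (the real code initializes the llama.cpp backend here; the names below are illustrative):

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;

// Process-wide backend handle: initialized exactly once, so constructing a
// second provider cannot trip a BackendAlreadyInitialized error.
static BACKEND: OnceLock<u32> = OnceLock::new();
static INIT_CALLS: AtomicU32 = AtomicU32::new(0);

fn backend() -> &'static u32 {
    BACKEND.get_or_init(|| {
        // Real code would perform the llama.cpp backend init here.
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        42
    })
}
```

Every provider calls `backend()`; only the first call runs the initializer, and later calls get the cached handle.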
2026-01-28 10:33:10 +11:00
Dhanji R. Prasanna
ba6e1f9896 Remove unused code to eliminate build warnings
- Remove unused SYSTEM_PROMPT_FOR_NATIVE_TOOL_USE and SYSTEM_PROMPT_FOR_NON_NATIVE_TOOL_USE constants
- Remove unused gpu_layers field from EmbeddedProvider struct
- Remove unused clean_stop_sequences method from EmbeddedProvider
2026-01-28 10:01:44 +11:00
Dhanji R. Prasanna
a902be1562 Refactor system prompts to eliminate duplication; upgrade embedded provider
- Refactor prompts.rs: extract shared sections (intro, TODO, workspace memory,
  web research, response guidelines) used by both native and non-native prompts
- Fix typo in native prompt: "save them.." -> "save them."
- Fix non-native prompt: add missing closing braces in JSON examples,
  add IMPORTANT steps section, align with native prompt quality
- Add 9 unit tests to verify both prompts contain required sections
- Upgrade llama-cpp-2 dependency and refactor embedded provider
- Update config.example.toml with embedded model examples
- Update workspace memory
2026-01-28 09:56:39 +11:00
Dhanji R. Prasanna
585684a86e Fix dead_code warning in studio crate
- Add #[allow(dead_code)] to GitWorktree::list() method
2026-01-27 13:09:56 +11:00
Dhanji R. Prasanna
755acabd47 Highlight command argument completions in cyan
- /run path completions shown in cyan
- /resume session ID completions shown in cyan
- /project name completions shown in cyan
2026-01-27 12:45:37 +11:00
Dhanji R. Prasanna
8389b0d652 Add TAB autocompletion for /project command
- Complete project names from ~/projects/ directory
- Display shows project name, replacement uses ~/projects/<name> path
- Projects sorted alphabetically
- Added test for project completion
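The matching logic reduces to a small pure function (a hypothetical name; the real completer reads names from ~/projects/ and replaces the argument with the full path):

```rust
/// Complete a /project argument against known project names:
/// prefix match, results sorted alphabetically.
fn complete_project(prefix: &str, names: &[&str]) -> Vec<String> {
    let mut matches: Vec<String> = names
        .iter()
        .filter(|n| n.starts_with(prefix))
        .map(|n| n.to_string())
        .collect();
    matches.sort();
    matches
}
```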
2026-01-27 12:43:24 +11:00
Dhanji R. Prasanna
cdb8b0f5eb refactor(g3-core): consolidate Agent construction into single canonical path
Eliminate code-path aliasing in Agent construction methods by introducing
a single `build_agent()` helper that all constructors delegate to.

Before: 3 nearly-identical `Ok(Self { ... })` blocks (~30 lines each)
with subtle differences in auto_compact, is_autonomous, quiet, and
computer_controller fields - prone to drift over time.

After: Single canonical `build_agent()` method that constructs Agent
with all fields. All public constructors delegate to this single path:
- new_for_test() -> new_for_test_with_readme() -> build_agent()
- new_with_mode_and_readme() -> build_agent()

Changes:
- Add `build_agent()` private helper method (single source of truth)
- Simplify `new_for_test()` to delegate to `new_for_test_with_readme()`
- Update `new_for_test_with_readme()` to use `build_agent()`
- Update `new_with_mode_and_readme()` to use `build_agent()`

Net reduction: ~43 lines (-109/+66)
All 190 tests pass.

Agent: fowler
2026-01-27 12:01:12 +11:00
Dhanji R. Prasanna
ffea6b5fac Tighten fowler prompt 2026-01-27 11:54:21 +11:00
Dhanji R. Prasanna
dfa0e4bfa2 refactor(g3-core): add section markers to lib.rs for better organization
Added clear section comments to organize the 3000-line lib.rs into
logical groupings:

- CONSTRUCTION METHODS (~line 159)
- CONFIGURATION & PROVIDER RESOLUTION (~line 444)
- TASK EXECUTION (~line 782)
- SESSION MANAGEMENT (~line 1069)
- CONTEXT WINDOW OPERATIONS (~line 1148)
- STREAMING & LLM INTERACTION (~line 1563)
- TOOL EXECUTION (~line 2825)

This improves code navigation and provides clear boundaries for
future extraction into separate modules.

No behavioral changes - all 191 tests pass.

Agent: fowler
2026-01-27 11:46:17 +11:00
Dhanji R. Prasanna
5b4079e861 Add prompt cache statistics tracking to /stats command
- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls
- Add "Prompt Cache Statistics" section to /stats output showing:
  - API call count and cache hit count
  - Hit rate percentage
  - Total input tokens and cache read/creation tokens
  - Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
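The derived numbers in the /stats output follow from a handful of counters; a sketch of the CacheStats idea (the struct name is from the commit, but field and method names here are assumptions):

```rust
/// Cumulative prompt-cache statistics accumulated across API calls.
#[derive(Default)]
struct CacheStats {
    api_calls: u64,
    cache_hits: u64,
    input_tokens: u64,
    cache_read_tokens: u64,
}

impl CacheStats {
    /// Record one API call's usage numbers.
    fn record(&mut self, input_tokens: u64, cache_read_tokens: u64) {
        self.api_calls += 1;
        if cache_read_tokens > 0 {
            self.cache_hits += 1;
        }
        self.input_tokens += input_tokens;
        self.cache_read_tokens += cache_read_tokens;
    }

    /// Fraction of API calls that read anything from the cache.
    fn hit_rate(&self) -> f64 {
        if self.api_calls == 0 {
            0.0
        } else {
            self.cache_hits as f64 / self.api_calls as f64
        }
    }

    /// Cache efficiency: fraction of input tokens served from the cache.
    fn efficiency(&self) -> f64 {
        if self.input_tokens == 0 {
            0.0
        } else {
            self.cache_read_tokens as f64 / self.input_tokens as f64
        }
    }
}
```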
2026-01-27 11:32:45 +11:00
Dhanji R. Prasanna
96899230a4 Tweak hopper to encourage mocks and stubbing 2026-01-27 10:44:48 +11:00
Dhanji R. Prasanna
2e84f1ece0 test: fix ACD test race condition and add read_image characterization test
- Fix test_rehydrate_success race condition by using UUID for unique session IDs
- Add #[serial] attribute to prevent parallel execution conflicts
- Improve cleanup to remove entire session directory tree
- Add characterization test for resize_image_to_dimensions fallback behavior
  (documents fix from commit af8b849 for media type preservation)

Agent: hopper
2026-01-26 16:19:53 +11:00
Dhanji R. Prasanna
726e2d71f5 test: add integration test for project content surviving compaction
Add test_project_content_survives_compaction() to verify that project
content loaded via /project command persists through context compaction.

This is a CHARACTERIZATION test that validates:
- Project content appended to README message survives compaction
- The README message (containing project content) is preserved as message[1]
- PROJECT INSTRUCTIONS, ACTIVE PROJECT markers, Brief and Status sections
  all survive the compaction process

Agent: hopper
2026-01-26 16:09:17 +11:00
Dhanji R. Prasanna
d6a986ce0f refactor(cli): extract execute_user_input() to eliminate duplication
Both multiline and single-line input paths in interactive.rs had identical
code for:
- Template processing (process_template)
- Task execution (execute_task_with_retry)
- Auto-memory reminder with error handling

Extracted to a single execute_user_input() helper function that handles
all three steps. This eliminates code-path aliasing where the two paths
could drift over time.

File reduced from 401 to 393 lines (-2%).
All 106 g3-cli tests pass.

Agent: fowler
2026-01-26 15:59:55 +11:00
Dhanji R. Prasanna
57f04a77aa Add template expansion to interactive prompts
Apply {{today}} and other template variables to user input in:
- Interactive mode (single and multiline)
- Accumulative mode requirements
2026-01-26 15:43:39 +11:00
Dhanji R. Prasanna
7806897f00 Expand {{today}} to include day of week: YYYY-MM-DD (Monday) 2026-01-26 15:29:47 +11:00
Dhanji R. Prasanna
9de8e8cc76 Fix compaction bug: use User role for summary to maintain alternation
The previous implementation added the summary as a System message, which
caused "Conversation must start with a user message" errors because the
first non-system message after compaction was Assistant (the preserved
last assistant message).

Fix: Change summary from System to User message, creating valid alternation:
[System Prompt] -> [Summary as USER] -> [Last Assistant] -> [Latest User]

This also prevents system message bloat across multiple compactions since
the summary is now part of the conversation flow and gets replaced on
each compaction.

Added test_second_compaction_no_bloat to verify no accumulation.
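The shape of the compacted conversation can be sketched structurally (a sketch only; the real compaction also generates the summary via the LLM, and the types here are illustrative):

```rust
#[derive(Clone, Debug, PartialEq)]
enum Msg {
    System(String),
    User(String),
    Assistant(String),
}

/// Compact a history into [system prompt] -> [summary as USER] ->
/// [last assistant] -> [latest user], keeping alternation valid.
fn compact(history: &[Msg], summary: &str) -> Vec<Msg> {
    let mut out = Vec::new();
    if let Some(sys @ Msg::System(_)) = history.first() {
        out.push(sys.clone());
    }
    // The summary is a User message, not System: the first non-system
    // message is a user turn, and re-compaction replaces this message
    // instead of accumulating system bloat.
    out.push(Msg::User(summary.to_string()));
    if let Some(a) = history.iter().rev().find(|m| matches!(m, Msg::Assistant(_))) {
        out.push(a.clone());
    }
    if let Some(u) = history.iter().rev().find(|m| matches!(m, Msg::User(_))) {
        out.push(u.clone());
    }
    out
}
```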
2026-01-26 15:24:04 +11:00