alex/g3 - g3 - Millerson GIT hosting


Author	SHA1	Message	Date
Dhanji R. Prasanna	d4941dc95a	refactor(providers): improve readability of embedded.rs and gemini.rs embedded.rs (937→789 lines, -16%): - Extract duplicated inference setup into prepare_context() helper - Extract stop sequence handling into find_stop_sequence() and truncate_at_stop_sequence() - Add InferenceParams struct to consolidate request parameter extraction - Add clear section markers for code organization - Tests now use module-level format functions directly (no duplication) gemini.rs: - Extract common request building into build_request() method - Reduces duplication between complete() and stream() methods All 399 unit tests pass. Behavior unchanged. Agent: carmack	2026-01-29 11:39:46 +11:00
Dhanji R. Prasanna	21f8d5a1aa	Add integration tests for CacheStats and Gemini serialization Agent: hopper Added two new integration test files: 1. cache_stats_integration_test.rs (g3-core) - Tests CacheStats accumulation through streaming completion flow - Verifies cache hit detection (cache_read_tokens > 0) - Tests multi-request accumulation of cache statistics - Verifies cache efficiency and hit rate calculations - Uses MockProvider to simulate provider usage data 2. gemini_serialization_test.rs (g3-providers) - Tests Gemini API message format conversion - Verifies system messages become system_instruction - Verifies assistant role maps to "model" (Gemini terminology) - Tests tool conversion to function_declarations format - Characterizes multi-system-message behavior (last wins) Both test files follow blackbox/integration testing principles: - Test observable behavior through stable surfaces - Do not assert internal implementation details - Include documentation of what is/is not asserted	2026-01-29 11:28:52 +11:00
Dhanji R. Prasanna	56f558dc1b	Fix compiler warnings in test files Eliminate unused variable and import warnings across test files: - streaming_parser_test.rs: prefix unused `tools` with underscore - webdriver_session.rs: remove unused `use super::*` import - mock_provider_integration_test.rs: prefix unused `result` and `task_result` - test_preflight_max_tokens.rs: prefix unused `proposed_max` - todo_staleness_test.rs: add #[allow(dead_code)] for test helper methods - json_parsing_stress_test.rs: prefix unused `tools` - read_file_token_limit_test.rs: add #[allow(dead_code)] for unused helper - background_process_demo_test.rs: remove unused PathBuf import - test_session_continuation.rs: prefix unused `temp_dir` in 7 tests All tests pass. No behavior changes. Agent: fowler	2026-01-29 11:15:10 +11:00
Dhanji R. Prasanna	5c1e0630b5	Merge sessions/interactive/664ee473	2026-01-29 11:14:28 +11:00
Dhanji R. Prasanna	7bfb9efa19	Remove automatic README loading from context window README.md is no longer auto-loaded into the LLM context at startup. This saves ~4,600 tokens per session while AGENTS.md and memory.md still provide all critical information for code tasks. Changes: - Delete read_project_readme() function - Remove readme_content parameter from combine_project_content() - Rename extract_readme_heading() -> extract_project_heading() - Rename Agent constructors: _with_readme_ -> _with_project_context_ - Update context preservation to only check for Agent Configuration - Remove has_readme field from LoadedContent - Update all tests to use new markers and function names The LLM can still read README.md on-demand via read_file when needed.	2026-01-29 11:07:41 +11:00
Dhanji R. Prasanna	5ea43d7b39	Add --project CLI flag for loading projects at startup Adds a new --project <PATH> flag that loads project files (brief.md, contacts.yaml, status.md) at startup, similar to the /project command but WITHOUT auto-executing the project status prompt. Changes: - Add --project flag to cli_args.rs - Add load_and_validate_project() helper in project.rs (shared by both --project flag and /project command) - Modify run_interactive() to accept optional initial_project parameter - Wire up --project in lib.rs to load project before interactive mode - Refactor /project command to use shared helper (reduces duplication) - Add 4 new tests for load_and_validate_project()	2026-01-29 11:06:08 +11:00
Dhanji R. Prasanna	f6717b4435	Add Gemini 3 model context window detection	2026-01-29 10:20:56 +11:00
Dhanji R. Prasanna	735e9c9312	Add Google Gemini provider support - Add GeminiProvider with streaming and native tool calling - Support gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-pro/flash models - Model-specific context window detection (1M-2M tokens) - Message conversion: assistant -> model role mapping - System messages extracted to system_instruction field - Tool schema conversion with functionCall/functionResponse parts - SSE streaming with JSON array buffer parsing - 8 unit tests for conversion and parsing logic - Register provider in g3-core and validate in g3-cli	2026-01-29 10:11:42 +11:00
Dhanji R. Prasanna	fe33568ee0	Fix embedded provider max_tokens default (2048 -> 8192) The resolve_max_tokens() function was returning 2048 for embedded providers, which caused responses to be truncated prematurely. Increased to 8192 to allow the provider's own effective_max_tokens() calculation to work properly.	2026-01-28 13:58:14 +11:00
Dhanji R. Prasanna	58fe74334d	Auto-detect context window size from GGUF for embedded providers - Add context_window_size() method to LLMProvider trait - Implement for EmbeddedProvider to return the auto-detected context length - Update Agent to query provider directly instead of using hardcoded defaults - Removes need for model-specific context length mappings	2026-01-28 11:16:14 +11:00
Dhanji R. Prasanna	55dba121b7	Add GLM-4 to context length defaults (32k) GLM-4 models support 32k context but were falling back to the conservative 4096 default, causing context overflow on startup.	2026-01-28 10:46:36 +11:00
Dhanji R. Prasanna	e32c302023	Fix embedded provider initialization and logging - Use global OnceLock for llama.cpp backend to prevent BackendAlreadyInitialized error - Suppress verbose llama.cpp stderr logging during model loading - Fix provider validation to accept "embedded.name" format (extract type before dot)	2026-01-28 10:33:10 +11:00
Dhanji R. Prasanna	ba6e1f9896	Remove unused code to eliminate build warnings - Remove unused SYSTEM_PROMPT_FOR_NATIVE_TOOL_USE and SYSTEM_PROMPT_FOR_NON_NATIVE_TOOL_USE constants - Remove unused gpu_layers field from EmbeddedProvider struct - Remove unused clean_stop_sequences method from EmbeddedProvider	2026-01-28 10:01:44 +11:00
Dhanji R. Prasanna	a902be1562	Refactor system prompts to eliminate duplication; upgrade embedded provider - Refactor prompts.rs: extract shared sections (intro, TODO, workspace memory, web research, response guidelines) used by both native and non-native prompts - Fix typo in native prompt: "save them.." -> "save them." - Fix non-native prompt: add missing closing braces in JSON examples, add IMPORTANT steps section, align with native prompt quality - Add 9 unit tests to verify both prompts contain required sections - Upgrade llama-cpp-2 dependency and refactor embedded provider - Update config.example.toml with embedded model examples - Update workspace memory	2026-01-28 09:56:39 +11:00
Dhanji R. Prasanna	585684a86e	Fix dead_code warning in studio crate - Add #[allow(dead_code)] to GitWorktree::list() method	2026-01-27 13:09:56 +11:00
Dhanji R. Prasanna	755acabd47	Highlight command argument completions in cyan - /run path completions shown in cyan - /resume session ID completions shown in cyan - /project name completions shown in cyan	2026-01-27 12:45:37 +11:00
Dhanji R. Prasanna	8389b0d652	Add TAB autocompletion for /project command - Complete project names from ~/projects/ directory - Display shows project name, replacement uses ~/projects/<name> path - Projects sorted alphabetically - Added test for project completion	2026-01-27 12:43:24 +11:00
Dhanji R. Prasanna	cdb8b0f5eb	refactor(g3-core): consolidate Agent construction into single canonical path Eliminate code-path aliasing in Agent construction methods by introducing a single `build_agent()` helper that all constructors delegate to. Before: 3 nearly-identical `Ok(Self { ... })` blocks (~30 lines each) with subtle differences in auto_compact, is_autonomous, quiet, and computer_controller fields - prone to drift over time. After: Single canonical `build_agent()` method that constructs Agent with all fields. All public constructors delegate to this single path: - new_for_test() -> new_for_test_with_readme() -> build_agent() - new_with_mode_and_readme() -> build_agent() Changes: - Add `build_agent()` private helper method (single source of truth) - Simplify `new_for_test()` to delegate to `new_for_test_with_readme()` - Update `new_for_test_with_readme()` to use `build_agent()` - Update `new_with_mode_and_readme()` to use `build_agent()` Net reduction: ~43 lines (-109/+66) All 190 tests pass. Agent: fowler	2026-01-27 12:01:12 +11:00
Dhanji R. Prasanna	dfa0e4bfa2	refactor(g3-core): add section markers to lib.rs for better organization Added clear section comments to organize the 3000-line lib.rs into logical groupings: - CONSTRUCTION METHODS (~line 159) - CONFIGURATION & PROVIDER RESOLUTION (~line 444) - TASK EXECUTION (~line 782) - SESSION MANAGEMENT (~line 1069) - CONTEXT WINDOW OPERATIONS (~line 1148) - STREAMING & LLM INTERACTION (~line 1563) - TOOL EXECUTION (~line 2825) This improves code navigation and provides clear boundaries for future extraction into separate modules. No behavioral changes - all 191 tests pass. Agent: fowler	2026-01-27 11:46:17 +11:00
Dhanji R. Prasanna	5b4079e861	Add prompt cache statistics tracking to /stats command - Extend Usage struct with cache_creation_tokens and cache_read_tokens fields - Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens - Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching - Add CacheStats struct to Agent for cumulative tracking across API calls - Add "Prompt Cache Statistics" section to /stats output showing: - API call count and cache hit count - Hit rate percentage - Total input tokens and cache read/creation tokens - Cache efficiency (% of input served from cache) - Update all provider implementations and test files	2026-01-27 11:32:45 +11:00
Dhanji R. Prasanna	2e84f1ece0	test: fix ACD test race condition and add read_image characterization test - Fix test_rehydrate_success race condition by using UUID for unique session IDs - Add #[serial] attribute to prevent parallel execution conflicts - Improve cleanup to remove entire session directory tree - Add characterization test for resize_image_to_dimensions fallback behavior (documents fix from commit `af8b849` for media type preservation) Agent: hopper	2026-01-26 16:19:53 +11:00
Dhanji R. Prasanna	726e2d71f5	test: add integration test for project content surviving compaction Add test_project_content_survives_compaction() to verify that project content loaded via /project command persists through context compaction. This is a CHARACTERIZATION test that validates: - Project content appended to README message survives compaction - The README message (containing project content) is preserved as message[1] - PROJECT INSTRUCTIONS, ACTIVE PROJECT markers, Brief and Status sections all survive the compaction process Agent: hopper	2026-01-26 16:09:17 +11:00
Dhanji R. Prasanna	d6a986ce0f	refactor(cli): extract execute_user_input() to eliminate duplication Both multiline and single-line input paths in interactive.rs had identical code for: - Template processing (process_template) - Task execution (execute_task_with_retry) - Auto-memory reminder with error handling Extracted to a single execute_user_input() helper function that handles all three steps. This eliminates code-path aliasing where the two paths could drift over time. File reduced from 401 to 393 lines (-2%). All 106 g3-cli tests pass. Agent: fowler	2026-01-26 15:59:55 +11:00
Dhanji R. Prasanna	57f04a77aa	Add template expansion to interactive prompts Apply {{today}} and other template variables to user input in: - Interactive mode (single and multiline) - Accumulative mode requirements	2026-01-26 15:43:39 +11:00
Dhanji R. Prasanna	7806897f00	Expand {{today}} to include day of week: YYYY-MM-DD (Monday)	2026-01-26 15:29:47 +11:00
Dhanji R. Prasanna	9de8e8cc76	Fix compaction bug: use User role for summary to maintain alternation The previous implementation added the summary as a System message, which caused "Conversation must start with a user message" errors because the first non-system message after compaction was Assistant (the preserved last assistant message). Fix: Change summary from System to User message, creating valid alternation: [System Prompt] -> [Summary as USER] -> [Last Assistant] -> [Latest User] This also prevents system message bloat across multiple compactions since the summary is now part of the conversation flow and gets replaced on each compaction. Added test_second_compaction_no_bloat to verify no accumulation.	2026-01-26 15:24:04 +11:00
Dhanji R. Prasanna	83f68dae17	style: convert CLI status messages to G3Status format Convert remaining ✅ emoji status messages in g3-cli to use the consistent G3Status formatting system: - accumulative.rs: 'autonomous run ... [done]' - commands.rs /clear: 'clearing session ... [done]' - commands.rs /readme: 'reloading README ... [done/failed/error]' - commands.rs /unproject: 'unloading project ... [done]' This provides a consistent 'g3: action ... [status]' format across all CLI status messages.	2026-01-23 10:08:22 +05:30
Dhanji R. Prasanna	155db74aac	style: use G3Status formatting for agent mode completion message Change agent mode completion from '✅ Agent mode completed' to 'g3: <agent-name> session ... [done]' for consistency with other g3 status messages.	2026-01-23 10:04:05 +05:30
Dhanji R. Prasanna	5d0d532b47	feat: preserve last assistant message during compaction When context window compaction occurs, the last assistant message is now preserved in addition to the system prompt, README, and summary. This improves continuity after compaction by keeping the LLM's most recent response, which often contains important context about what was just done or what comes next. New message order after compaction: [System Prompt] -> [README/AGENTS.md] -> [ACD Stub?] -> [Summary] -> [Last Assistant] -> [Latest User?] Changes: - Add last_assistant_message field to PreservedMessages struct - Modify extract_preserved_messages() to find last assistant message - Modify reset_with_summary_and_stub() to include last assistant message - Add comprehensive integration tests using MockProvider Tests cover edge cases: - No assistant message exists - Tool-call-only assistant messages (still preserved) - Multiple assistant messages (only last one preserved) - No trailing user message	2026-01-23 09:54:03 +05:30
Dhanji R. Prasanna	dfdc21c3cf	Use G3Status formatting for /project loading message Changed from 'Project loaded: ✓ file1 ✓ file2' to 'g3: loading <project-name> .. ✓ file1 ✓ file2 .. [done]' - Add G3Status::loading_project() for consistent status formatting - Update /project command to use new formatting - Remove unused crossterm imports from commands.rs	2026-01-22 21:03:46 +05:30
Dhanji R. Prasanna	a488a6aa99	feat(cli): colorize project name in prompt via rustyline Highlighter Implement highlight_prompt() in G3Helper to colorize the project portion of the prompt in blue. This uses rustyline's proper mechanism for ANSI codes in prompts, which correctly handles cursor positioning. Prompt 'butler | finances> ' now shows '| finances>' in blue.	2026-01-22 10:48:17 +05:30
Dhanji R. Prasanna	067c69723b	fix(cli): use plain text prompt without ANSI colors ANSI color codes in rustyline prompts cause various issues: - \x01...\x02 markers break cursor movement - Separate prefix printing causes gaps or disappearing text Simplified to plain text prompt: 'butler | finances> ' This ensures reliable cursor positioning and tab completion.	2026-01-22 10:27:27 +05:30
Dhanji R. Prasanna	cb1f99c41c	Revert "fix(cli): use '> ' as readline prompt when project active" This reverts commit `4d9399f737`.	2026-01-22 10:24:21 +05:30
Dhanji R. Prasanna	4d9399f737	fix(cli): use '> ' as readline prompt when project active Previously used empty string as readline prompt after printing colored prefix, which caused cursor positioning issues (large gap between project name and cursor). Now the prefix contains 'butler | finances' (colored) and readline gets '> ' as its prompt, so cursor appears immediately after '> '.	2026-01-22 10:18:15 +05:30
Dhanji R. Prasanna	28dd60d4fc	fix(cli): separate colored prefix from readline prompt Rustyline's \x01...\x02 markers for ANSI codes didn't work correctly, causing cursor positioning issues and breaking line editing. New approach: build_prompt() returns (prefix, prompt) tuple where: - prefix: colored text printed before readline (contains ANSI codes) - prompt: plain text passed to readline (no ANSI codes) This ensures rustyline correctly calculates line length while still showing the colored project name.	2026-01-22 09:59:52 +05:30
Dhanji R. Prasanna	be35fa2a7f	fix(cli): wrap ANSI codes in prompt for rustyline compatibility Rustyline needs ANSI escape codes wrapped in \x01...\x02 markers to correctly calculate visible prompt length. Without this, tab completion breaks because rustyline miscalculates cursor position.	2026-01-22 08:30:30 +05:30
Dhanji R. Prasanna	3001df3b1a	style(cli): simplify project prompt format Change from: butler |[finances]> Change to: butler | finances>	2026-01-22 08:15:18 +05:30
Dhanji R. Prasanna	af8b849311	fix(read_image): use correct media type when resize fails to reduce size When resize_image_to_dimensions() returns a larger file than the original, we fall back to using the original bytes. Previously, was_resized was set to true if the original dimensions exceeded MAX_IMAGE_DIMENSION, which caused final_media_type to be set to 'image/jpeg' even though we were using the original PNG bytes. This caused Anthropic API errors like: 'Image does not match the provided media type image/jpeg' Fix: Set was_resized=false when falling back to original bytes, so the original media type (detected from magic bytes) is preserved.	2026-01-22 07:58:05 +05:30
Dhanji R. Prasanna	022f5c70a6	feat(cli): show active project name in interactive prompt When a project is loaded via /project, the prompt now shows: agent_name |[project_name]> where the |[project_name]> part is displayed in blue. Examples: - Default: g3> - With project: g3 |[myapp]> - Agent mode: butler> - Agent + project: butler |[myapp]> The prompt automatically resets when /unproject is called. Added build_prompt() function with 7 unit tests covering all prompt states.	2026-01-22 07:24:00 +05:30
Dhanji R. Prasanna	9325a43ff3	feat(cli): shorten file paths in tool output display Add three-level path shortening hierarchy for cleaner CLI output: 1. Project path -> <project_name>/... (when project loaded via /project) 2. Workspace path -> ./... (relative to current working directory) 3. Home path -> ~/... (fallback for paths under home directory) Changes: - Add shorten_path() and shorten_paths_in_command() functions in display.rs - Add project_path/project_name fields to ConsoleUiWriter - Add set_workspace_path(), set_project_path(), clear_project() to UiWriter trait - Add ui_writer() getter to Agent struct - Wire up project path setting in /project and /unproject commands - Set workspace path when creating agents in all CLI modes Before: ● read_file | /Users/dhanji/icloud/butler/projects/appa_estate/status.md After: ● read_file | appa_estate/status.md (with project loaded) ● read_file | ./src/main.rs (workspace-relative) ● read_file | ~/Documents/file.txt (home-relative)	2026-01-21 21:27:16 +05:30
Dhanji R. Prasanna	d7d32db4a4	Fix tab completion in agent+chat mode Remove duplicate logging initialization in agent_mode.rs. Logging is already initialized in run() before agent mode is dispatched. The duplicate tracing_subscriber::fmt::layer() was interfering with rustyline's terminal state, breaking tab completion.	2026-01-21 15:24:27 +05:30
Dhanji R. Prasanna	581de4845c	Add /project and /unproject to tab completion	2026-01-21 14:58:23 +05:30
Dhanji R. Prasanna	feb7c3e40d	Add /project and /unproject commands for project-specific context - Add Project struct in crates/g3-cli/src/project.rs with file loading logic - Load brief.md, contacts.yaml, status.md from project path - Load projects.md from workspace root for cross-project context - Project content appended to system message (survives compaction/dehydration) - /project <path> loads project and auto-submits prompt asking about state - /unproject clears project content and resets context - Add set_project_content(), clear_project_content(), has_project_content() to Agent - Add new_for_test_with_readme() for testing with custom README content - Add 6 unit tests for Project struct - Add 9 integration tests for project context behavior	2026-01-21 14:53:30 +05:30
Dhanji R. Prasanna	a34a3b08e9	Rename Project Memory to Workspace Memory Rename all references from "Project Memory" to "Workspace Memory" to avoid future conflation if a "project" concept is introduced later. Changes: - Rename read_project_memory() -> read_workspace_memory() - Update all prompts, tool descriptions, and comments - Update header parsing in memory.rs to use "# Workspace Memory" - Update display detection for "=== Workspace Memory ===" - Update documentation and analysis/memory.md 11 files changed, ~36 occurrences updated.	2026-01-21 14:08:42 +05:30
Dhanji R. Prasanna	6a5ce11e7b	Consolidate redundant assistant message test files Deleted 4 redundant test files (~956 lines): - assistant_message_dedup_test.rs (416 lines, 12 tests) - consecutive_assistant_message_test.rs (248 lines, 6 tests) - missing_assistant_message_test.rs (100 lines, 4 tests) - early_return_path_test.rs (192 lines, 5 tests) - whitebox test Created consolidated assistant_message_test.rs (369 lines, 14 tests): - Helper function tests for consecutive message detection - ContextWindow unit tests for normal and tool execution flows - Bug demonstration tests documenting what bugs looked like - Invariant tests for user/assistant alternation - Missing assistant message fallback logic tests The early_return_path_test was removed because it: - Referenced specific line numbers in production code (brittle) - Reimplemented internal logic (whitebox anti-pattern) - Duplicated coverage from mock_provider_integration_test.rs All 729 g3-core tests pass.	2026-01-21 10:27:07 +05:30
Dhanji R. Prasanna	c5d549c211	Readability pass: remove verbose comments and clean up tests - completion.rs: Remove redundant comments, clean up test output (println! -> let _) - g3_status.rs: Condense doc comments, rename from_str() to parse() - streaming.rs: Remove obvious doc comments that duplicate function names - simple_output.rs, ui_writer_impl.rs: Update Status::parse() calls All changes are behavior-preserving. 132 lines removed, code is more scannable. Agent: carmack	2026-01-21 07:13:20 +05:30
Dhanji R. Prasanna	c4ce853cc6	Fix streaming markdown tests for Dracula heading colors Update test assertions to match new heading color scheme: - H1: bold pink (\x1b[1;95m) instead of bold magenta - H2: purple/magenta (\x1b[35m) - unchanged - H3: cyan (\x1b[36m) instead of magenta	2026-01-21 07:01:53 +05:30
Dhanji R. Prasanna	9397687949	Remove unused mouse control and macax accessibility code Removed dead code that was never used by any g3 tool: - macax/ module (accessibility control via AXApplication, AXElement) - move_mouse() and click_at() methods from ComputerController trait - macax_demo.rs and test_type_text.rs examples The ComputerController trait now only has take_screenshot(), which is the only method actually used by the screenshot tool.	2026-01-21 06:54:31 +05:30
Dhanji R. Prasanna	a89cad955a	Remove VisionBridge OCR (unused) VisionBridge was a Swift library for Apple Vision OCR that was built every compile but never actually used by any g3 tool. Removed: - vision-bridge/ Swift package directory - src/ocr/ module (vision.rs, tesseract.rs, mod.rs) - OCR methods from ComputerController trait - OCR-related code from platform implementations - TextLocation type (no longer needed) - test_vision.rs example Simplified: - build.rs (now empty, no Swift compilation) - MacOSController (no longer holds OCR engine) - LinuxController and WindowsController (stub implementations) Build time improvement: No more 'Building VisionBridge Swift package...' messages on every compile.	2026-01-21 06:42:01 +05:30
Dhanji R. Prasanna	38b0019ad4	Fix compile warnings and tweak error message format Warnings fixed: - Remove unused 'warn' import from retry.rs - Prefix unused 'output' param with underscore - Prefix unused 'rel_start' with underscore - Add #[allow(dead_code)] to G3Status::info() Message format tweaked per feedback: - 'g3: model overloaded [error]' (no attempt info) - 'g3: retrying in 2.2s (1/3) ... [done]' (attempt info moved here) - Handle empty error message in Status::Error to show just '[error]'	2026-01-20 22:49:55 +05:30
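
The three-level path shortening described in commit 9325a43ff3 (project, then workspace, then home) could look roughly like the sketch below. This is an illustration only, not the code in display.rs: the signature, the parameter names, and the workspace path used in the usage note are assumptions, and it assumes Unix-style paths.

```rust
use std::path::Path;

/// Hypothetical sketch of a three-level path shortener.
/// Tries the project prefix first, then the workspace, then the home directory,
/// and returns the path unchanged when none of them match.
fn shorten_path(
    path: &str,
    project: Option<(&str, &str)>, // (project_path, project_name), assumed shape
    workspace: Option<&str>,
    home: Option<&str>,
) -> String {
    let p = Path::new(path);

    // 1. Project path -> <project_name>/...
    if let Some((project_path, project_name)) = project {
        if let Ok(rest) = p.strip_prefix(project_path) {
            return format!("{}/{}", project_name, rest.display());
        }
    }

    // 2. Workspace path -> ./...
    if let Some(ws) = workspace {
        if let Ok(rest) = p.strip_prefix(ws) {
            return format!("./{}", rest.display());
        }
    }

    // 3. Home path -> ~/...
    if let Some(h) = home {
        if let Ok(rest) = p.strip_prefix(h) {
            return format!("~/{}", rest.display());
        }
    }

    path.to_string()
}
```

With a project loaded from /Users/dhanji/icloud/butler/projects/appa_estate, the call `shorten_path("/Users/dhanji/icloud/butler/projects/appa_estate/status.md", ...)` yields `appa_estate/status.md`. Checking the most specific prefix first matters because the project and workspace directories both sit under the home directory here, so the home rule would otherwise match everything.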


714 Commits
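
The message ordering from the compaction fix in commit 9de8e8cc76 ([System Prompt] -> [Summary as USER] -> [Last Assistant] -> [Latest User]) can be illustrated with a small sketch. The `Role` and `Message` types and both function names are hypothetical stand-ins, not the types used in g3-core.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Hypothetical rebuild of the context after compaction.
/// The summary is inserted with the User role so that the first
/// non-system message is a user message.
fn rebuild_after_compaction(
    system_prompt: &str,
    summary: &str,
    last_assistant: Option<&str>,
    latest_user: Option<&str>,
) -> Vec<Message> {
    let mut out = vec![
        Message { role: Role::System, content: system_prompt.to_string() },
        Message { role: Role::User, content: summary.to_string() },
    ];
    if let Some(a) = last_assistant {
        out.push(Message { role: Role::Assistant, content: a.to_string() });
    }
    if let Some(u) = latest_user {
        out.push(Message { role: Role::User, content: u.to_string() });
    }
    out
}

/// Returns true when, ignoring system messages, the conversation starts
/// with a user message and then strictly alternates user/assistant.
fn alternates(messages: &[Message]) -> bool {
    let mut expected = Role::User;
    for m in messages.iter().filter(|m| m.role != Role::System) {
        if m.role != expected {
            return false;
        }
        expected = if expected == Role::User { Role::Assistant } else { Role::User };
    }
    true
}
```

With the summary stored as a system message instead, the first non-system message would be the preserved assistant message, which is the ordering the commit reports as triggering "Conversation must start with a user message". The sketch only checks alternation; it does not repair the case where no assistant message exists but a trailing user message does.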