alex/g3 - g3 - Millerson GIT hosting

alex/g3

Author	SHA1	Message	Date
Jochen	0bca05a1ba	Merge pull request #62 from cjustice/fix/planning-verbose-flag Fix: Initialize logging before planning mode check	2026-01-15 12:51:11 +11:00
Dhanji R. Prasanna	85ea8fe69c	Update project memory with agent-specific language prompts Document the new agent+language prompt injection feature including: - AGENT_LANGUAGE_PROMPTS static array location - get_agent_language_prompt() and get_agent_language_prompts_for_workspace_with_langs() - File naming pattern: prompts/langs/<agent>.<lang>.md - Instructions for adding new agent+lang prompts	2026-01-15 06:43:42 +05:30
Dhanji R. Prasanna	04e3c69b0a	Add --accept flag to studio run command Automatically accept the session after g3 completes successfully, but only if there are commits on the branch. Changes: - Add --accept flag to Run command (stripped, not passed to g3) - Add has_commits_on_branch() helper using git rev-list --count - Auto-accept triggers merge to main and cleanup when: 1. g3 exits successfully (exit code 0) 2. Branch has commits ahead of main - Show warning if --accept set but no commits exist Usage: studio run --agent carmack --accept	2026-01-15 06:43:35 +05:30
Dhanji R. Prasanna	5d8dbc43f8	Add agent-specific language prompt injection When running in agent mode (e.g., --agent carmack) in a workspace with detected languages, inject agent+language-specific prompts from prompts/langs/<agent>.<lang>.md at the end of the system prompt. Changes: - Add AGENT_LANGUAGE_PROMPTS static array for compile-time embedding - Add get_agent_language_prompt() to look up specific agent+lang combos - Add get_agent_language_prompts_for_workspace_with_langs() that returns both content and matched languages for display - Update agent_mode.rs to inject prompts and show which languages loaded - Display format: '✓ carmack: racket language guidance' - Add tests for new functionality Uses the same detect_languages() mechanism as regular language prompts to avoid code-path aliasing.	2026-01-15 06:43:29 +05:30
Dhanji R. Prasanna	eefc067aae	Add carmack.racket.md agent-specific language prompt Racket-specific guidance for the carmack agent including: - Idiomatic Racket patterns (match, for/*, cond) - Module organization with explicit provide lists - Contracts and type boundaries - Data modeling with structs - Error handling best practices - IO, paths, and portability - Performance considerations - Macro guidelines - Testing with rackunit	2026-01-15 06:43:20 +05:30
Connor Justice	fa29a64e51	Simplify logging initialization comment Removed unnecessary comment about logging initialization.	2026-01-14 17:53:04 -05:00
Connor Justice	505225c0bd	fix: prevent panic when tracing subscriber already initialized Use try_init() instead of init() for tracing subscriber setup to gracefully handle cases where a global subscriber is already set. This fixes a panic in the scout agent subprocess when spawned by the research tool, where a dependency may have already initialized tracing. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-14 15:33:22 -05:00
Connor Justice	6532442d32	fix: initialize logging before planning mode check Move initialize_logging() call to run immediately after CLI parsing, before any mode checks. This ensures the --verbose flag works correctly in planning mode, which previously bypassed logging initialization. Previously, planning mode would return early before initialize_logging() was called, causing verbose output to be silently ignored.	2026-01-14 14:33:44 -05:00
Dhanji R. Prasanna	afec65fd50	Add language-specific prompt injection for toolchain guidance - Add language_prompts module that auto-detects programming languages in workspace - Scan for language files with depth limit (2) to inject relevant toolchain prompts - Add prompts/langs/ directory for language-specific markdown files - Include Racket/raco toolchain guidance as first language prompt - Update combine_project_content() to accept language_content parameter - Integrate language detection into main CLI flow and agent mode - Update project memory with new feature documentation	2026-01-14 21:00:52 +05:30
Dhanji R. Prasanna	716d598bd8	remove openai specific config example	2026-01-14 20:24:53 +05:30
Dhanji R. Prasanna	affa878992	Add minimal OpenAI example config	2026-01-14 20:21:38 +05:30
Dhanji R. Prasanna	f4562cd4c9	config: default agent settings and provider override	2026-01-14 20:14:33 +05:30
Dhanji R. Prasanna	38828c7757	Clean up tool output formatting - Shell: "✅ Command executed successfully" → "⚡️ ran successfully" - Write file: Remove ✏️ emoji, use plain "wrote N lines \| M chars"	2026-01-14 19:42:54 +05:30
Dhanji R. Prasanna	9ef064a041	Add guidance to shell tool description to avoid unnecessary cd prefixes LLMs were prefixing shell commands with `cd <workspace> &&` unnecessarily, wasting tokens and cluttering CLI display. Added clear guidance in the shell tool description that commands already execute in the working directory.	2026-01-14 19:00:53 +05:30
Dhanji R. Prasanna	03143ec7f8	Agent Mode Enhancements • Agent prompts are now embedded within the g3 binary • README.md - Added new "Agent Mode" section documenting: • All 7 built-in agents with their focus areas • Usage examples (--list-agents, --agent <name>) • How to create custom workspace agents Behavior 1. Workspace agents take priority - If agents/<name>.md exists in the workspace, it's used 2. Embedded fallback - If no workspace agent exists, the embedded version is used 3. Portability - g3 binary now works on any repo without needing the agents/ directory 4. Discoverability - g3 --list-agents shows all available agents and their source	2026-01-14 16:27:03 +05:30
Dhanji R. Prasanna	5104bd53b6	refactor(g3-core): improve stream_completion_with_tools readability Extract and simplify the streaming completion function: - Extract ensure_context_capacity() helper for pre-loop context management (thinning + compaction logic now in dedicated async method) - Simplify compact_summary generation block: flatten nested if/match, remove redundant comments, reorder branches for clarity - Remove dead code: unused _last_error variable and modified_tool_call - Streamline duplicate detection block: reduce verbose logging - Clean up text content display block: remove redundant comments, tighten variable declarations - Remove redundant is_todo_tool redefinition inside block expression Net reduction: 79 lines (-187/+108) Behavior unchanged, all unit tests pass. Agent: carmack	2026-01-14 15:11:53 +05:30
Dhanji R. Prasanna	996dc357b4	Skip session resume prompt when --new-session flag is passed When users explicitly pass --new-session, they want a fresh session. Previously g3 would still prompt to resume an existing session. Now the resume check is skipped entirely when the flag is set.	2026-01-14 08:54:35 +05:30
Dhanji R. Prasanna	dea0e6b1ca	Compact tool output improvements - Rename take_screenshot -> screenshot, code_coverage -> coverage (shorter names) - Align \| character across all compact tools (pad to 11 chars for str_replace) - Make code_search a compact tool with summary display - Show language and search name in code_search output (e.g., rust:"find structs") - Add format_code_search_summary() to extract match/file counts from JSON response	2026-01-14 08:12:50 +05:30
Dhanji R. Prasanna	bd25d7dace	Merge sessions/fowler/786b20b5	2026-01-14 04:28:06 +05:30
Dhanji R. Prasanna	7d17b436f9	refactor(g3-core): remove 3 unused Agent constructor variants Remove dead code - constructor variants that had no callers: - new_with_readme() - new_autonomous_with_readme() - new_with_quiet() These were thin wrappers around new_with_mode_and_readme() that were never used externally. All 5 remaining constructors have verified callers. Results: - lib.rs reduced from 2817 to 2797 lines (-20 lines) - Eliminated code-path aliasing: 8 constructors → 5 constructors - All g3-core tests pass - Full workspace compiles cleanly Agent: fowler	2026-01-14 04:26:42 +05:30
Dhanji R. Prasanna	21eb4f2d30	Only show Chrome diagnostics when there are issues Silence the diagnostic report when all checks pass to reduce noise.	2026-01-14 04:25:13 +05:30
Dhanji R. Prasanna	a1dfd9c0b6	Enhanced auto-memory with rich few-shot format - Updated memory reminder prompt with per-symbol char ranges - Added two few-shot examples: Session Continuation (feature) + UTF-8 Safe Slicing (pattern) - Updated system prompt Memory Format section to match - Format: file -> nested symbols with [start..end] ranges and descriptions - Enables direct read_file navigation to specific functions	2026-01-13 21:49:48 +05:30
Dhanji R. Prasanna	3a47ebe668	better racket example support	2026-01-13 21:16:14 +05:30
Dhanji R. Prasanna	c2f96d7048	Make WebDriver and Chrome headless enabled by default - webdriver flag now defaults to true (tools always available) - chrome_headless flag now defaults to true (Chrome is default browser) - Use --safari flag to override and use Safari instead - Updated README documentation to reflect new defaults	2026-01-13 21:14:52 +05:30
Dhanji R. Prasanna	151b8c4658	Add Racket tree-sitter support, remove Kotlin - Add tree-sitter-racket dependency (v0.24) - Initialize Racket parser in code search - Add .rkt, .rktl, .rktd file extensions - Add test_racket_search test - Remove Kotlin from supported languages (was disabled) - Clean up duplicate test files Supported languages: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Racket	2026-01-13 18:44:59 +05:30
Dhanji R. Prasanna	5e45e110e2	refactor(g3-core): extract finalize_streaming_turn() to unify return paths Extract a single canonical helper function for completing streaming turns, eliminating 3 nearly-identical return paths in stream_completion_with_tools(). Changes: - Add finalize_streaming_turn() helper that handles: - Finishing streaming markdown - Saving context window - Adding timing footer (when requested) - Dehydrating context (when ACD enabled) - Building TaskResult - Replace 3 duplicated return blocks with calls to the helper - Remove unused mut on full_response variable Results: - Function reduced from 1067 to 999 lines (-68 lines) - Eliminated code-path aliasing: 3 paths → 1 canonical path - All 32 characterization tests pass - Full g3-core test suite passes Agent: fowler	2026-01-13 16:52:48 +05:30
Dhanji R. Prasanna	333a85ed1e	Merge sessions/hopper/e2a0ad02	2026-01-13 16:27:17 +05:30
Dhanji R. Prasanna	b89d55a9ff	Add characterization tests for stream_completion_with_tools Add 32 blackbox characterization tests to lock down the behavior of the stream_completion_with_tools function (1067 lines) before refactoring. Tests cover key behaviors through stable boundaries: - StreamingToolParser: tool call detection, incomplete detection, text accumulation - Auto-continue logic: autonomous mode decisions, priority ordering - Duplicate detection: sequential duplicates, cross-message duplicates - Context window: token tracking, compaction threshold, history preservation - Tool execution: read_file, shell, write_file, todo tools through Agent - Streaming utilities: LLM token cleaning, duration formatting, truncation - Parser sanitization: inline tool pattern handling, homoglyph replacement These tests intentionally do NOT assert: - Internal parser state or implementation details - Specific timing values - UI output formatting - Provider-specific behavior Agent: hopper	2026-01-13 16:25:33 +05:30
Dhanji R. Prasanna	bd756307f1	fowler doesnt need to explicity read README/AGENTS	2026-01-13 16:16:27 +05:30
Dhanji R. Prasanna	47e3a88cf6	refactor(g3-core): extract stats formatting to dedicated module Extract the get_stats() function (158 lines) from lib.rs to a new stats.rs module. Changes: - Create stats.rs with AgentStatsSnapshot struct for capturing agent state - Replace inline formatting logic with delegation to snapshot.format() - Add unit tests for stats formatting (empty and populated states) - Reduce lib.rs from 2961 to 2818 lines (-143 lines) The new module improves: - Testability: Stats formatting can now be unit tested in isolation - Separation of concerns: Formatting logic is decoupled from Agent struct - Readability: lib.rs is more focused on core agent behavior All 271 workspace tests pass. Agent: fowler	2026-01-13 16:11:53 +05:30
Dhanji R. Prasanna	562c4199f8	docs: Add Studio documentation and UTF-8 safety invariants README.md: - Add Studio section documenting the multi-agent workspace manager - Document usage: run, list, status, accept, discard commands - Explain worktree-based isolation and workflow AGENTS.md: - Add UTF-8 safe string slicing as critical invariant (#8) - Add MUST NOT for byte-index slicing on multi-byte text (#5) - Document parser sanitization as dangerous/subtle code path (prevents parser poisoning from inline tool-call JSON patterns) Agent: lamport	2026-01-13 15:31:01 +05:30
Dhanji R. Prasanna	9a3b03a41f	Remove flock mode (superseded by studio) Flock mode has been superseded by the studio multi-agent workspace manager. Changes: - Remove g3-ensembles crate entirely - Remove --project, --flock-workspace, --segments, --flock-max-turns CLI flags - Remove run_flock_mode() from autonomous.rs - Remove flock-related tests from cli_integration_test.rs - Update README.md, docs/architecture.md, analysis/memory.md - Delete docs/FLOCK_MODE.md	2026-01-13 15:01:12 +05:30
Dhanji R. Prasanna	82c0165765	Fix unused variable warning and UTF-8 panic in string slicing - Remove unused total_lines variable in file_ops.rs - Fix UTF-8 boundary panic in utils.rs when generating diff error preview The code was slicing at byte index 200 which could land inside a multi-byte character (e.g., box-drawing chars like ─). Now uses character-based slicing with chars().take() instead.	2026-01-13 14:52:52 +05:30
Dhanji R. Prasanna	c65d082c5d	Make --agent optional in Studio for one-shot mode Studio can now run g3 without specifying an agent: # Agent mode (existing) studio run --agent carmack "fix the bug" # One-shot mode (new) studio run "fix the bug" When no agent is specified, sessions are created under the 'single' directory in .worktrees/sessions/single/<session-id>/ This makes Studio a complete replacement for Flock mode.	2026-01-13 14:42:20 +05:30
Dhanji R. Prasanna	f6b84d864a	Rename G3 -> g3 in docs and comments Standardize project name to lowercase 'g3' throughout documentation, comments, and configuration files. Environment variables (G3_*) are unchanged as they follow the uppercase convention.	2026-01-13 14:36:33 +05:30
Dhanji R. Prasanna	389ed6a554	Compact project info display in interactive mode Before: 🤖 AGENTS.md configuration loaded 📚 detected: G3 - AI Coding Agent 🧠 Project memory loaded workspace: /Users/dhanji/src/g3 After: >> G3 - AI Coding Agent ✓ README \| ✓ AGENTS.md \| ✓ Memory -> ~/src/g3	2026-01-13 14:32:24 +05:30
Dhanji R. Prasanna	af3aa840db	Compress session continuation UI prompt	2026-01-13 14:29:54 +05:30
Dhanji R. Prasanna	118935d2da	Remove unused variable total_lines in file_ops.rs	2026-01-13 14:25:17 +05:30
Dhanji R. Prasanna	a09967eb27	refactor(streaming): Extract deduplication and auto-continue logic into helpers Improve readability of stream_completion_with_tools (~1000 line function): - Add deduplicate_tool_calls() helper with closure for previous-message check - Add should_auto_continue() with AutoContinueReason enum for clearer control flow - Replace inline deduplication loop with helper call (-19 lines) - Replace complex auto-continue conditional with match on reason enum (-13 lines) - Add section comments for major phases (State Init, Pre-loop, Main Loop, Auto-Continue, Post-Loop) - Add comprehensive tests for new helpers Net reduction: 82 deletions, behavior unchanged (172+ tests pass) Agent: carmack	2026-01-13 11:44:06 +05:30
Dhanji R. Prasanna	dc45987e8d	Add characterization tests for UTF-8 truncation and parser sanitization Agent: hopper Adds 32 new integration tests covering recent commits: ## UTF-8 Safe Truncation Tests (14 tests) Covers commit `f30f145` (Fix UTF-8 panics): - Topic extraction with emoji, CJK, and multi-byte characters - Truncation at character boundaries (not byte boundaries) - Edge cases: exactly 50 chars, 51 chars, 2-byte/3-byte/4-byte UTF-8 - Stub generation with multi-byte topics - Combining characters and diacritics ## Parser Sanitization Tests (18 tests) Covers commit `4c36cc0` (Prevent parser poisoning): - Code block contexts (inline code, after fences, prose) - Line boundary edge cases (empty lines, whitespace, indentation) - Unicode handling (emoji, bullets, CJK before patterns) - Multiple patterns on same line - Negative cases (similar but different patterns, partial patterns) - Real-world scenarios from the original bug report All tests are blackbox/characterization style - they test observable outputs through stable public interfaces without encoding internal implementation details.	2026-01-13 11:22:46 +05:30
Dhanji R. Prasanna	8dcb7a3dba	feat: add compact styled output for TODO tools TODO tools (todo_read, todo_write) now display with a cleaner, more compact format: - Styled header: " ● todo_read" or " ● todo_write" - Tree-style prefixes for content lines (│ and └) - Checkbox conversion: "- [ ]" → □, "- [x]" → ■ - Dimmed content for visual distinction - No timing footer (cleaner output) Changes: - Add print_todo_compact() method to UiWriter trait - Implement print_todo_compact() in ConsoleUiWriter - Update todo.rs to call print_todo_compact() instead of line-by-line output - Skip tool header, output header, and timing for TODO tools in agent streaming	2026-01-13 10:58:55 +05:30
Dhanji R. Prasanna	4c36cc058c	fix: prevent parser poisoning from inline tool-call JSON patterns When the streaming parser encountered fragments of JSON that looked like partial tool calls (e.g., {"tool":) embedded in inline text (like code examples or prose), it would incorrectly enter JSON parsing mode and poison the parser state, causing control to be returned to the user mid-task. This fix: - Adds sanitize_inline_tool_patterns() to detect tool-call patterns that are NOT on their own line and replace the opening brace with a Unicode homoglyph (fullwidth left curly bracket U+FF5B) - Integrates sanitization into process_chunk() before text is buffered - Updates system prompts to instruct LLMs to use homoglyphs when showing example tool call JSON in prose - Adds comprehensive tests for the sanitization logic Real tool calls from LLMs always appear on their own line, so those are left untouched. Only inline patterns (with non-whitespace before them) are sanitized.	2026-01-13 10:58:41 +05:30
Dhanji R. Prasanna	a0b9126555	Revert "refactor(g3-core): extract streaming logic to agent_streaming.rs" This reverts commit `a2e51cf075`.	2026-01-13 07:59:18 +05:30
Dhanji R. Prasanna	6907fa36c0	UI: Add newline before auto-memory skip message	2026-01-13 07:03:42 +05:30
Dhanji R. Prasanna	08e6a1dca0	Merge sessions/fowler/f8c3f2e5	2026-01-13 06:58:01 +05:30
Dhanji R. Prasanna	98eea09dc8	UI: Show consecutive read_file calls as continuation lines When the LLM reads the same file multiple times in sequence (scrolling through a large file), instead of showing each as a separate line: ● read_file \| path [0..2000] \| 50 lines \| 100 ◉ 5ms ● read_file \| path [2000..4000] \| 50 lines \| 100 ◉ 5ms ● read_file \| path [4000..6000] \| 50 lines \| 100 ◉ 5ms Now shows a cleaner continuation format: ● read_file \| path [0..2000] \| 50 lines \| 100 ◉ 5ms └─ reading further [2000..4000] \| 50 lines \| 100 ◉ 5ms └─ reading further [4000..6000] \| 50 lines \| 100 ◉ 5ms This makes it visually clear that the agent is scrolling through a single file rather than reading multiple different files. Implementation: - Added last_read_file_path field to ConsoleUiWriter - Detect when consecutive read_file calls target the same file - Print continuation format for subsequent reads - Reset tracking when: - A different tool is executed (shell, write_file, etc.) - A different file is read - Text is output between tool calls	2026-01-13 06:25:28 +05:30
Dhanji R. Prasanna	a2e51cf075	refactor(g3-core): extract streaming logic to agent_streaming.rs Reduce lib.rs complexity by extracting the streaming completion logic: - Extract stream_completion_with_tools (~1080 lines) to agent_streaming.rs - Extract stream_with_retry helper method - Extract parse_diff_stats helper function - Add handle_pre_stream_compaction helper for cleaner pre-stream logic - Add format_tool_output helper for tool output formatting - Remove 3 unused constructor variants: - new_with_readme - new_autonomous_with_readme - new_with_quiet Results: - lib.rs reduced from 2974 to 1791 lines (40% reduction) - Streaming logic cleanly separated into dedicated module - All tests pass, no behavior changes Agent: fowler	2026-01-13 06:14:56 +05:30
Dhanji R. Prasanna	5c9404e292	Refactor: improve readability in CLI modules - project_files.rs: Fix UTF-8 safety in truncate_for_display (use char boundaries instead of byte slicing), add test for multi-byte chars - task_execution.rs: Extract recoverable_error_name() helper, use shared calculate_retry_delay() from error_handling.rs to eliminate duplication - ui_writer_impl.rs: Extract duration_color() helper for timing display, add clear_tool_state() to consolidate repeated mutex clearing patterns Agent: carmack	2026-01-13 05:58:54 +05:30
Dhanji R. Prasanna	f30f145c85	Fix UTF-8 panics and inconsistent retry logic - Fix 7 UTF-8 byte slicing panics that crash on multi-byte characters: - acd.rs: extract_topic_from_text() [..50] slice - streaming.rs: log_stream_error() [..500] slice - tools/acd.rs: rehydrate message truncation [..2000] slice - history.rs: git commit message truncation [..69] slice - planner.rs: commit summary/description truncation [..69] slices - llm.rs: requirements summary line truncation [..117] slice - All now use chars().count() and chars().take(N).collect() for UTF-8 safe truncation - Fix inconsistent retry logic in task_execution.rs: - Previously only retried on Timeout errors - Now retries on ALL recoverable errors (rate limits, network, server errors, model busy, token limits, context length) - Added error-specific base delays (rate limit: 5s, server: 2s, etc.) - Added exponential backoff with ±20% jitter - Consistent with autonomous mode retry behavior	2026-01-13 05:49:45 +05:30
Dhanji R. Prasanna	6f50d01ab6	Add comprehensive end-of-turn behavior tests for g3-core Agent: hopper Adds 56 new integration tests covering the observable end-of-turn behaviors in the streaming module: - Timing footer formatting (5 tests): verifies user-facing timing display with various durations, token counts, and context percentages - Tool call duplicate detection (6 tests): ensures identical sequential tool calls are detected while different tools/args are not - Empty response detection (9 tests): validates detection of empty, whitespace-only, and timing-only responses that trigger auto-continue - Connection error classification (5 tests): verifies EOF, connection, chunk, and body errors are correctly identified for graceful recovery - Tool output summary formatting (17 tests): covers read_file, write_file, str_replace, remember, screenshot, coverage, and rehydrate summaries - Duration formatting (4 tests): milliseconds, seconds, minutes, zero - Text truncation (4 tests): short/long strings, multiline, flag behavior - LLM token cleaning (3 tests): removal of stop tokens like <\|im_end\|> - Edge cases (4 tests): empty inputs, unicode handling, large numbers All tests are blackbox/characterization style - they test observable outputs through stable public interfaces without encoding internal implementation details. Tests remain stable under refactoring that preserves behavior.	2026-01-12 21:17:32 +05:30

1 2 3 4 5 ...

609 Commits