271 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
8926775acb Add session continuation symlink fix and /resume command
Fix session detection:
- Add save_session_continuation() calls at all session exit points
- Sessions now properly create .g3/session symlink for resume detection
- Fixes issue where g3 wasn't offering to resume previous sessions

Add /resume command:
- New list_sessions_for_directory() to scan available sessions
- New switch_to_session() method to safely switch between sessions
- Shows numbered list with timestamps, context %, and TODO status
- Saves current session before switching (can be resumed later)
- Restores full context if <80% used, otherwise uses summary
- Machine mode supports /resume and /resume <number>

Documentation:
- Add /clear and /resume to CONTROL_COMMANDS.md
- Update /help output with new commands
2026-01-11 05:30:58 +08:00
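For commit 8926775acb above, the rule "restore full context if <80% used, otherwise use the summary" can be sketched as follows. This is a minimal illustration only: `RestoreMode` and `choose_restore_mode` are hypothetical names, and the real `switch_to_session()` may measure usage differently.

```rust
/// Hypothetical outcome of resuming a saved session.
#[derive(Debug, PartialEq)]
enum RestoreMode {
    FullContext,
    Summary,
}

/// Picks a restore mode from token usage (assumed inputs, not g3's actual API).
fn choose_restore_mode(used_tokens: u64, total_tokens: u64) -> RestoreMode {
    // Guard against a zero-sized window; fall back to the summary.
    if total_tokens == 0 {
        return RestoreMode::Summary;
    }
    // Strictly below 80% used; integer math avoids float rounding.
    if used_tokens * 100 < total_tokens * 80 {
        RestoreMode::FullContext
    } else {
        RestoreMode::Summary
    }
}
```

Under this sketch, a session at exactly 80% usage falls back to the summary, matching the strict "<80%" wording of the commit message.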
Dhanji R. Prasanna
86709834e2 Improve research tool error reporting for scout agent failures
When the scout agent fails (e.g., context window exhaustion), now:
- Captures both stdout and stderr from the scout process
- Detects context window exhaustion errors with specific patterns
- Provides detailed, actionable error messages to the user
- Shows suggestions for how to work around the issue
- Includes technical details (exit code, error output) for debugging

Handles two failure modes:
1. Scout agent exits with non-zero status
2. Scout agent exits successfully but doesn't produce valid report markers

Both cases now surface clear error messages instead of cryptic failures.
2026-01-10 20:50:43 +11:00
Dhanji R. Prasanna
60aeb67c56 Add stealth mode for Chrome headless to evade bot detection
Implements comprehensive anti-detection measures:
- Override navigator.webdriver to return undefined
- Inject fake chrome.runtime, chrome.loadTimes, chrome.csi objects
- Add realistic plugins and mimeTypes arrays
- Patch permissions API to hide automation
- Set realistic navigator properties (languages, hardwareConcurrency, deviceMemory)
- Remove ChromeDriver-specific window properties (cdc_*)
- Patch Function.prototype.toString to hide modifications
- Add Chrome flags: --disable-blink-features=AutomationControlled
- Set realistic user-agent without HeadlessChrome identifier
- Exclude 'enable-automation' switch

Tested against bot detection sites:
- bot.sannysoft.com: All major tests pass
- Search engines: Works with DuckDuckGo, Yahoo, Brave, Startpage
- Still detected by: Google reCAPTCHA, Cloudflare Turnstile, Bing
2026-01-10 20:34:14 +11:00
Dhanji R. Prasanna
6be0a03c4c Fix timing footer being saved to context window
The timing footer (e.g., ⏱️ 19.4s | 💭 4.7s) was being saved to the
conversation history as a separate assistant message. This happened
because stream_completion_with_tools returns the timing footer in
TaskResult.response for display, but the caller was also saving it
to context.

Fix: Strip the timing footer (identified by \n\n⏱️) before saving
to context window. The timing footer remains display-only.

Also includes:
- Research tool blank line fix: only add visual separator for research
  tool output, not all tools
- Research tool webdriver propagation: pass parent's webdriver browser
  choice (Safari vs Chrome headless) to scout subprocess
2026-01-10 15:55:59 +11:00
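For commit 6be0a03c4c above, stripping the footer before saving to context might look like the sketch below. The function name `strip_timing_footer` is an assumption; only the `\n\n⏱️` marker comes from the commit message.

```rust
/// Marker that introduces the display-only timing footer.
/// Only the base stopwatch code point (U+23F1) is matched, so the marker
/// is found with or without a trailing emoji variation selector.
const TIMING_FOOTER_MARKER: &str = "\n\n\u{23F1}";

/// Returns `response` without a trailing timing footer, if one is present.
/// Hypothetical helper; the real fix may live inline in the caller.
fn strip_timing_footer(response: &str) -> &str {
    match response.rfind(TIMING_FOOTER_MARKER) {
        // Keep everything before the last occurrence of the marker.
        Some(pos) => &response[..pos],
        None => response,
    }
}
```

The caller would keep displaying the full response and pass only the stripped text to the context window.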
Dhanji R. Prasanna
68c9135913 Fix research tool UI: remove duplicate header, add footer spacing, remove spinner, widen command display
- Remove duplicate tool header (lib.rs already prints it)
- Add newline before timing footer for visual separation
- Remove spinner animation (incompatible with update_tool_output_line)
- Change shell command format to " > `cmd` ..." with 60 char width
2026-01-10 15:20:40 +11:00
Dhanji R. Prasanna
0aa1287ca6 Remove final_output tool and improve scout report handback
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs

Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping

657 tests pass.
2026-01-10 13:43:04 +11:00
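For commit 0aa1287ca6 above, delimiter-based extraction with ANSI stripping could be sketched as below. The exact marker strings are inferred from "---SCOUT_REPORT_START/END---", and `strip_ansi` and `extract_report` are hypothetical names; the sketch handles only CSI sequences (ESC `[` ... final byte), which cover common color codes.

```rust
// Assumed spellings of the two markers.
const REPORT_START: &str = "---SCOUT_REPORT_START---";
const REPORT_END: &str = "---SCOUT_REPORT_END---";

/// Removes ANSI CSI escape sequences from `input`.
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes up to and including the final byte (0x40..=0x7E).
            for n in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&n) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Returns the trimmed text between the markers, or None if either is missing.
fn extract_report(output: &str) -> Option<String> {
    // Strip escapes first so colored output cannot hide the markers.
    let clean = strip_ansi(output);
    let start = clean.find(REPORT_START)? + REPORT_START.len();
    let end = start + clean[start..].find(REPORT_END)?;
    Some(clean[start..end].trim().to_string())
}
```

Returning `None` when a marker is missing corresponds to the second failure mode described in commit 86709834e2 (scout exits successfully but produces no valid report markers).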
Dhanji R. Prasanna
cab2fb187a Stream scout agent output to CLI during research
The research tool now streams the underlying scout agent's output
to the CLI in real-time for visual indication of progress. This
output is displayed but not added to the conversation context.
2026-01-09 20:39:53 +11:00
Dhanji R. Prasanna
22d1ac8096 Move WebDriver instructions from main prompt to scout agent
Simplified the main system prompt's web research section to just direct
users to the research tool. Moved the detailed WebDriver usage instructions
to scout.md where they belong, since the scout agent is the one that
actually uses WebDriver for research.

Main prompt now simply says: use the research tool for web research.
Scout agent now has the full WebDriver best practices documentation.
2026-01-09 16:01:47 +11:00
Dhanji R. Prasanna
33e5705fc3 Add research tool for web-based research via scout agent
New tool that spawns a scout agent to perform web research and return
a structured research brief. The scout agent uses webdriver to browse
the web and returns a decision-ready report.

Changes:
- Added 'research' tool definition (12 core tools total)
- Added research tool dispatch in tool_dispatch.rs
- Created tools/research.rs implementation:
  - Spawns 'g3 --agent scout <query>' as subprocess
  - Captures stdout and extracts last line (report file path)
  - Reads and returns the report file contents
- Added exclude_research flag to ToolConfig
- Scout agent (agent_name == 'scout') does NOT have access to research
  tool to prevent infinite recursion
- Updated system prompts to describe when to use research tool
- Added scout.md agent prompt with research brief output contract

The research tool is preferred for complex research tasks (APIs, SDKs,
libraries, approaches, bugs). WebDriver can still be used directly for
simple lookups or fine-grained control.
2026-01-09 15:59:19 +11:00
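For commit 33e5705fc3 above, the recursion guard can be illustrated roughly as follows. Only `ToolConfig`, `exclude_research`, and the `'scout'` agent name come from the commit message; `config_for_agent`, `available_tools`, and the shortened tool list are assumptions made for the sketch.

```rust
/// Simplified stand-in for the real ToolConfig.
struct ToolConfig {
    exclude_research: bool,
}

/// The scout agent must not receive the research tool, otherwise it could
/// spawn another scout and recurse indefinitely.
fn config_for_agent(agent_name: Option<&str>) -> ToolConfig {
    ToolConfig {
        exclude_research: agent_name == Some("scout"),
    }
}

/// Builds an illustrative (abbreviated) list of tool names for a config.
fn available_tools(config: &ToolConfig) -> Vec<&'static str> {
    let mut tools = vec!["shell", "read_file", "write_file"];
    if !config.exclude_research {
        tools.push("research");
    }
    tools
}
```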
Dhanji R. Prasanna
de50726eeb Prefer ripgrep over grep in system prompts
Added guidance to use rg (ripgrep) instead of grep in shell commands.
Ripgrep is faster, has better defaults, and respects .gitignore.
2026-01-09 15:28:04 +11:00
Dhanji R. Prasanna
e301075666 Fix panic on multi-byte chars in filter_json buffer truncation
The buffer truncation code was slicing at a raw byte offset which could
land in the middle of a multi-byte character (like emojis), causing a
panic. Fixed by using char_indices() to find valid character boundaries.

Also added stop_reason field to CompletionChunk initializers in tests
to complete the stop_reason feature addition.

- Fix byte boundary panic in filter_json.rs line 327
- Add test for multi-byte character handling
- Update test files with missing stop_reason field
2026-01-09 15:20:57 +11:00
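For commit e301075666 above, the technique of using `char_indices()` to find a safe cut point can be sketched like this. The helper name and the choice to keep the head of the buffer are assumptions; the actual code in filter_json.rs may truncate differently.

```rust
/// Returns the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a UTF-8 character boundary. Slicing at a raw byte offset instead
/// would panic if the offset fell inside a multi-byte character.
fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = 0;
    // char_indices() yields the byte offset where each character starts,
    // so `idx + len_utf8()` is always a valid slice boundary.
    for (idx, ch) in s.char_indices() {
        let next = idx + ch.len_utf8();
        if next > max_bytes {
            break;
        }
        end = next;
    }
    &s[..end]
}
```

With a 4-byte emoji starting at byte offset 2, a limit of 4 bytes yields only the two ASCII characters before it rather than panicking mid-character.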
Dhanji R. Prasanna
c470964628 Fix: Save LLM text response to context after tool execution
When the LLM executes a tool and then outputs text (e.g., analysis after
reading images), the text was being displayed during streaming but never
saved to the context window. This caused:

1. The response to appear truncated in the session log
2. Loss of context for subsequent turns
3. The LLM losing track of what it had already said

The fix saves current_response to the context window before breaking
out of the streaming loop for auto-continue after tool execution.

Reproduction scenario:
- User asks LLM to read images and analyze them
- LLM calls read_image tool
- Tool executes successfully
- LLM outputs analysis text ("Now I can see the results...")
- Text was displayed but lost from session log

Now the text is properly persisted to the context window.
2026-01-09 15:04:43 +11:00
Dhanji R. Prasanna
777191b3cb Remove final_output tool - let summaries stream naturally
- Remove final_output from tool definitions, dispatch, and misc tools
- Update system prompts to request summaries as regular markdown text
- Remove print_final_output from UiWriter trait and all implementations
- Remove final_output handling from agent core logic
- Rename final_output_summary → summary in session continuation
- Delete final_output test files
- Update tool count tests (12→11, 27→26)

This allows LLM summaries to stream through the markdown formatter
for a more natural, responsive user experience instead of buffering
everything into a tool call.
2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna
bebf04c7bd Tighten system prompt
2026-01-09 14:11:19 +11:00
Dhanji R. Prasanna
67be0f20c7 fix: remove allow_multiple_tool_calls config and simplify tool execution flow
This fixes a bug where the agent would stop responding abruptly without
calling final_output. The root cause was the allow_multiple_tool_calls
config option (default: false) which caused the agent to break out of
the streaming loop mid-stream after executing the first tool, losing
any subsequent content.

Changes:
- Remove allow_multiple_tool_calls config option entirely
- Always process all tool calls without breaking mid-stream
- Simplify system prompt generation (no longer needs boolean param)
- Let the stream complete fully before continuing to next iteration
- Change find_last_tool_call_start to find_first_tool_call_start
- Remove parser.reset() call on duplicate detection

Benefits:
- Simpler logic with less conditional branching
- No lost content after tool calls
- Consistent behavior for all users
- Reduced config complexity
2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna
347513b04c Add comprehensive stress tests for streaming markdown formatter
Add 10 stress tests covering:
- Nested formatting (bold in italic, italic in bold)
- Empty/minimal content edge cases
- Escape sequences and special characters
- Lists with complex inline formatting
- Links with various content types
- Tables with formatting in cells
- Code blocks (should not format contents)
- Mixed block elements (headers, quotes, rules)
- Nested lists (3+ levels, mixed types)
- Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines)

All 45 tests pass.
2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna
381b852869 refactor(g3-core): Extract streaming utilities into dedicated module
Extract reusable utilities from the massive stream_completion_with_tools
function into a new streaming.rs module for improved readability:

- format_duration, format_timing_footer: timing display helpers
- clean_llm_tokens: consolidates 4 duplicate token-cleaning call sites
- log_stream_error: extracts 70+ lines of error logging
- is_empty_response, is_connection_error: predicate helpers
- truncate_for_display, truncate_line: string truncation utilities
- StreamingState, IterationState: state structs for future refactoring

Results:
- lib.rs reduced from 2978 to 2840 lines (138 lines, ~5%)
- New streaming.rs: 309 lines with 5 unit tests
- All 98+ tests pass

Agent: carmack
2026-01-08 13:20:11 +11:00
Dhanji R. Prasanna
267ef00848 refactor: extract session helper in webdriver.rs to reduce boilerplate
Agent: carmack

Add get_session() helper function that:
- Checks if webdriver is enabled
- Acquires the session read lock
- Returns the cloned session or an error message

Refactored 13 webdriver tool functions to use this helper:
- execute_webdriver_navigate
- execute_webdriver_get_url
- execute_webdriver_get_title
- execute_webdriver_find_element
- execute_webdriver_find_elements
- execute_webdriver_click
- execute_webdriver_send_keys
- execute_webdriver_execute_script
- execute_webdriver_get_page_source
- execute_webdriver_screenshot
- execute_webdriver_back
- execute_webdriver_forward
- execute_webdriver_refresh

Each function previously had ~10 lines of identical boilerplate.
Now reduced to 4 lines using the helper.

Net reduction: 68 lines (678 -> 610)
All tests pass. Behavior unchanged.
2026-01-08 13:05:44 +11:00
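For commit 267ef00848 above, the three steps of the helper (check enabled, take the read lock, clone the session or return an error) might look like the sketch below. It uses `std::sync::RwLock` and invented `Session` and `WebDriverState` types purely for illustration; the real helper is likely async and its types and error messages are not shown in the commit.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a live browser session.
#[derive(Debug)]
struct Session {
    url: String,
}

/// Hypothetical holder for the webdriver flag and shared session.
struct WebDriverState {
    enabled: bool,
    session: RwLock<Option<Arc<Session>>>,
}

/// Shared preamble for every webdriver tool function.
fn get_session(state: &WebDriverState) -> Result<Arc<Session>, String> {
    if !state.enabled {
        return Err("WebDriver is not enabled".to_string());
    }
    let guard = state
        .session
        .read()
        .map_err(|_| "WebDriver session lock poisoned".to_string())?;
    // Clone the Arc so the lock is released as soon as this function returns.
    (*guard)
        .clone()
        .ok_or_else(|| "No active WebDriver session".to_string())
}

/// Example caller: the boilerplate shrinks to a single match on the helper.
fn webdriver_get_url(state: &WebDriverState) -> String {
    match get_session(state) {
        Ok(session) => session.url.clone(),
        Err(message) => message,
    }
}
```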
Dhanji R. Prasanna
5bfaee8dd5 use consistent naming for compaction
2026-01-08 12:54:03 +11:00
Dhanji R. Prasanna
bb63050779 refactor: improve readability of streaming and file ops code
Agent: carmack

databricks.rs:
- Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple
- Add decode_utf8_streaming() helper for cleaner UTF-8 handling
- Add is_incomplete_json_error() helper for JSON parse error detection
- Add make_final_chunk() helper to reduce duplication
- Add finalize_tool_calls() to convert accumulators to final format
- Refactor parse_streaming_response from ~270 lines to ~100 lines
- Reduce nesting depth from 8+ levels to 4 levels
- Use early returns and let-else for cleaner control flow

file_ops.rs:
- Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table
- Use match expression instead of nested if-else
- Reduce extract_path_and_content from 44 lines to 20 lines

All tests pass. Behavior unchanged.
2026-01-07 12:39:05 +11:00
Dhanji R. Prasanna
4e7aca50fa feat: royal blue tool names in agent mode + fix README heading display
- Add set_agent_mode() to UiWriter trait for visual mode differentiation
- ConsoleUiWriter uses royal blue (ANSI 256 color 69) for tool names in agent mode
- Fix extract_readme_heading() to search only README section of combined content
  (was incorrectly showing AGENTS.md heading instead of README heading)
2026-01-07 11:37:51 +11:00
Dhanji R. Prasanna
1980e62511 Improve code readability in g3-core
- streaming_parser.rs: Rename has_message_like_keys to args_contain_prose_fragments
  with improved documentation explaining the heuristic for detecting malformed
  tool calls where LLM prose leaked into JSON keys

- context_window.rs: Simplify build_thin_result_message using early return
  pattern and match expression for cleaner control flow

Agent: carmack
2026-01-07 11:16:42 +11:00
Dhanji R. Prasanna
48036d01e3 fix(g3-core): disable auto-continue in interactive mode
Auto-continue was incorrectly triggering when the LLM asked questions
in interactive/chat mode. Now auto-continue only activates when
is_autonomous is true, allowing proper back-and-forth conversation
in interactive mode.

Agent: fowler
2026-01-07 10:37:30 +11:00
Dhanji R. Prasanna
b73dfacb7a refactor(g3-core): extract provider_registration and session modules
Extract two focused modules from the monolithic lib.rs (3372 lines):

1. provider_registration.rs (233 lines)
   - Consolidates duplicated provider registration patterns
   - Single determine_providers_to_register() function for mode-based selection
   - Unified register_providers() async function for all provider types
   - Includes unit tests for registration logic

2. session.rs (394 lines)
   - Session ID generation (generate_session_id)
   - Context window persistence (save_context_window, write_context_window_summary)
   - Error logging (log_error_to_session)
   - Utility functions (format_token_count, token_indicator)
   - Session restoration helper (restore_from_session_log)
   - Includes comprehensive unit tests

Also fixes:
- Removed redundant tool_executed assignment that triggered unused warning
- Removed unused Message import in session.rs

Results:
- lib.rs reduced from 3372 to 2976 lines (-396 lines, -11.7%)
- All tests pass, no warnings
- Behavior preserved (pure mechanical extraction)

Agent: fowler
2026-01-07 10:20:28 +11:00
Dhanji R. Prasanna
5d20da2609 Add 54 integration tests for CLI, tools, and message serialization
New test files:
- crates/g3-cli/tests/cli_integration_test.rs (14 tests)
  Blackbox CLI tests: help/version flags, argument validation,
  conflicting modes, flock mode requirements

- crates/g3-core/tests/tool_execution_test.rs (20 tests)
  Tool call structure tests and unified diff application:
  read_file, write_file, str_replace, shell, background_process,
  todo, final_output, code_search, take_screenshot

- crates/g3-providers/tests/message_serialization_test.rs (20 tests)
  Round-trip serialization tests for Message, MessageRole,
  CacheControl, and Tool types. Covers Unicode, special chars,
  and edge cases.

All tests follow blackbox/integration-first principles with
documentation of what they protect and intentionally do not assert.
2026-01-07 09:23:34 +11:00
Dhanji R. Prasanna
e2445a5d22 refactor(g3-core): extract duplicate detection helper and consolidate thinning
- Extract check_duplicate_in_previous_message() helper to reduce nesting
  from 6+ levels to 2 levels in stream_completion_with_tools
- Create do_thin_context() and do_thin_context_all() helpers to centralize
  context thinning with event tracking
- Use provider_config::parse_provider_ref() in additional call sites
- All 295 tests pass

This continues the refactoring to eliminate code-path aliasing and
reduce cyclomatic complexity in the Agent implementation.
2026-01-07 08:45:51 +11:00
Dhanji R. Prasanna
386176899e Remove vision tools (except take_screenshot) and macax tools
Vision tools removed:
- extract_text (OCR from image files)
- extract_text_with_boxes (OCR with bounding boxes)
- vision_find_text (find text in app windows)
- vision_click_text (find and click on text)
- vision_click_near_text (click near text labels)

macax tools removed:
- macax_list_apps
- macax_get_frontmost_app
- macax_activate_app
- macax_press_key
- macax_type_text

The LLM can now read images directly via read_image tool.
take_screenshot is retained for capturing application windows.

Files deleted:
- crates/g3-core/src/tools/vision.rs
- crates/g3-core/src/tools/macax.rs
- docs/macax-tools.md

Updated tool counts: 12 core + 15 webdriver = 27 total
2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna
29e263ac49 Fix Unicode space handling in macOS screenshot filenames
macOS uses U+202F (Narrow No-Break Space) in screenshot filenames
between the time and am/pm. When users type or paste these paths,
they use regular spaces, causing file-not-found errors.

Changes:
- Add resolve_path_with_unicode_fallback() to try U+202F variants
- Add resolve_paths_in_shell_command() for shell command paths
- Apply fix to read_file, read_image, and shell tools
- Fix read_image prompt docs: file_path -> file_paths (array)
- Add 6 unit tests for Unicode space normalization
2026-01-03 17:17:08 +11:00
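For commit 29e263ac49 above, a fallback that retries with U+202F could be sketched as follows. The function names and the injected `exists` closure are assumptions chosen to keep the example testable without touching the filesystem; the real `resolve_path_with_unicode_fallback()` may generate variants differently.

```rust
/// Narrow No-Break Space, used by macOS before AM/PM in screenshot names.
const NARROW_NBSP: char = '\u{202F}';

/// Builds a variant of `path` where a regular space before AM/PM is replaced
/// by U+202F. This may over-match (e.g. " amber"), which is harmless here
/// because the variant is used only if it actually exists.
fn narrow_space_variant(path: &str) -> Option<String> {
    let mut variant = path.to_string();
    for marker in ["AM", "PM", "am", "pm"] {
        let typed = format!(" {}", marker);
        let on_disk = format!("{}{}", NARROW_NBSP, marker);
        variant = variant.replace(&typed, &on_disk);
    }
    if variant != path {
        Some(variant)
    } else {
        None
    }
}

/// Tries the path as typed first, then the U+202F variant.
fn resolve_with_fallback(path: &str, exists: impl Fn(&str) -> bool) -> Option<String> {
    if exists(path) {
        return Some(path.to_string());
    }
    narrow_space_variant(path).filter(|v| exists(v.as_str()))
}
```

In real use the closure would be something like `|p| std::path::Path::new(p).exists()`.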
Dhanji R. Prasanna
f4a1bf5e93 fix agent-mode session resumption bug
2026-01-03 16:44:58 +11:00
Dhanji R. Prasanna
76bfb77f84 further fowler fixes and session fixes
2026-01-03 15:47:04 +11:00
Dhanji R. Prasanna
65867e7f96 refactor tools out of lib.rs
2026-01-03 15:06:34 +11:00
Dhanji R. Prasanna
595ad6ad21 agent mode resumption
2026-01-03 14:50:08 +11:00
Dhanji R. Prasanna
016efc1db6 Prevent agent mode from stopping after first TODO phase
- Add TODO completion check to final_output tool in autonomous mode only
- When incomplete TODO items exist, reject final_output and prompt LLM to continue
- Non-autonomous modes (interactive, chat) are unaffected
- Add 6 tests verifying behavior in both autonomous and non-autonomous modes

Fixes issue where LLM would call final_output after completing first phase,
causing agent to stop prematurely instead of continuing with remaining phases.
2025-12-27 12:35:31 +11:00
Dhanji R. Prasanna
4c25e43ee4 refactoring
2025-12-26 15:16:12 +11:00
Dhanji R. Prasanna
666be4ff40 Fix duplicate tool call handling: move tool_executed flag and reset parser
- Move tool_executed = true after duplicate check to prevent auto-continue
  from triggering when only duplicate tools were detected
- Reset parser state when duplicate detected to clear any partial/polluted
  state from LLM stuttering or example tool calls in markdown blocks
2025-12-26 11:55:57 +11:00
Dhanji R. Prasanna
46611d9e13 Improve read_image output formatting
- Add newline after └─ before first image preview
- Show only filename (not full path) in info line
2025-12-26 11:36:10 +11:00
Dhanji R. Prasanna
2a4dad2842 Update read_image output with box drawing characters
- Print └─ before images to break out of tool output box
- Print ┌─ after images to resume tool output box
- Remove │ prefix from image preview and info lines
- Info line uses single space prefix, dimmed text
- Only include error messages in tool result (success info printed via imgcat)
2025-12-26 11:29:33 +11:00
Dhanji R. Prasanna
e688d3b29f Simplify read_image imgcat output formatting
- Remove │ prefix before image preview, use single space instead
- Keep info line on its own line with │ prefix
- Keep blank line spacing between images
2025-12-26 11:24:13 +11:00
Dhanji R. Prasanna
3601cc0547 Enhance read_image tool with magic byte detection and multi-image support
- Fix media type detection using magic bytes instead of file extension
  - Correctly identifies JPEG files with .png extension (and vice versa)
  - Supports PNG, JPEG, GIF, and WebP formats

- Add multi-image support with file_paths array parameter
  - Load multiple images in a single tool call
  - All images queued for LLM analysis

- Enhanced CLI output:
  - Inline image preview via iTerm2 imgcat protocol (height=5)
  - Dimmed info line showing: path | dimensions | media type | file size
  - Proper │ prefix alignment with tool output boxing
  - Human-readable file sizes (bytes, KB, MB)

- Add image dimension extraction from file headers
  - PNG, JPEG, GIF, WebP dimension parsing

- Add comprehensive tests for magic byte detection and dimensions
2025-12-26 11:19:37 +11:00
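For commit 3601cc0547 above, magic byte detection for the four listed formats can be sketched as below. The signatures are the standard file headers for PNG, JPEG, GIF, and WebP; the function name and the returned MIME strings are assumptions about how the tool reports the type.

```rust
/// Guesses an image media type from the leading bytes of a file,
/// ignoring whatever extension the file name carries.
fn detect_media_type(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]) {
        // PNG: fixed 8-byte signature.
        Some("image/png")
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        // JPEG: start-of-image marker followed by another marker byte.
        Some("image/jpeg")
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("image/gif")
    } else if bytes.len() >= 12 && bytes.starts_with(b"RIFF") && bytes[8..12] == *b"WEBP" {
        // WebP: RIFF container whose form type (bytes 8..12) is "WEBP".
        Some("image/webp")
    } else {
        None
    }
}
```

A JPEG saved as `photo.png` is still reported as `image/jpeg` here, which is the mismatch case the commit calls out.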
Dhanji R. Prasanna
3ece02ff31 fix: resolve compiler warnings across crates
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
2025-12-25 18:47:22 +11:00
Dhanji R. Prasanna
258f9878ff style: use ◉ symbol for token count in timing footer
Changes '227tk | 48% ctx' to '227 ◉ | 48%' for a cleaner look.
2025-12-25 18:40:17 +11:00
Dhanji R. Prasanna
d09c80180e fix: remove redundant TODO list header that breaks boxing effect
2025-12-25 18:34:51 +11:00
Dhanji R. Prasanna
64f27c0abc feat: move TODO lists to session-scoped directories
TODO lists are now stored in .g3/sessions/<session_id>/todo.g3.md instead
of the workspace root. This prevents different g3 sessions from accidentally
picking up or overwriting each other's TODOs.

Changes:
- Add get_session_todo_path() function in paths.rs
- Update todo_read/todo_write handlers to use session-specific paths
- Remove TODO loading at Agent initialization (sessions start fresh)
- Update prompts to reflect session-scoped behavior

Fallback behavior preserved for planner mode (G3_TODO_PATH env var).
2025-12-25 18:33:03 +11:00
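For commit 64f27c0abc above, building the session-scoped path could look like this sketch. The signature is hypothetical (the real `get_session_todo_path()` in paths.rs is not shown), and treating the `G3_TODO_PATH` value as a straight override passed in by the caller is an assumption about how the planner-mode fallback works.

```rust
use std::path::{Path, PathBuf};

/// Returns where the TODO list for a session lives.
/// `override_path` models a value read from G3_TODO_PATH (assumed behavior).
fn session_todo_path(workspace: &Path, session_id: &str, override_path: Option<&str>) -> PathBuf {
    match override_path {
        // Planner mode: use the explicitly provided location.
        Some(path) => PathBuf::from(path),
        // Default: .g3/sessions/<session_id>/todo.g3.md under the workspace.
        None => workspace
            .join(".g3")
            .join("sessions")
            .join(session_id)
            .join("todo.g3.md"),
    }
}
```

Because the session ID is part of the path, two sessions in the same workspace cannot read or overwrite each other's TODO file.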
Dhanji R. Prasanna
d9c58576a1 feat: add background_process tool for launching long-running processes
Adds a new tool that allows launching processes (like game servers) in the
background while g3 continues to operate. The process runs independently
with stdout/stderr captured to a log file.

Features:
- Named process tracking for easy reference
- Automatic log capture to logs/background_processes/
- Returns PID and log file path for use with shell tool
- Automatic cleanup on agent shutdown via Drop trait

Usage: Use shell tool to interact with the process:
- Read logs: tail -100 <logfile>
- Check status: ps -p <pid>
- Stop process: kill <pid>

Files:
- New: crates/g3-core/src/background_process.rs
- New: crates/g3-core/tests/background_process_demo_test.rs
- Modified: crates/g3-core/src/lib.rs (tool definition + handler)
- Modified: crates/g3-core/src/prompts.rs (documentation)
2025-12-25 18:23:10 +11:00
Dhanji R. Prasanna
9ff5ba6098 Fix auto-continue false positives from tool-call-like content
When the LLM outputs text containing tool call patterns (e.g., reading
log files, showing examples, or discussing tool calls), the parser's
has_unexecuted_tool_call() would detect these as real tool calls and
trigger auto-continue, leading to repeated empty responses.

The fix: mark the parser buffer as consumed when content is displayed.
This prevents tool-call-like patterns in displayed text from triggering
false positives later. The fix is safe because:

1. Only runs when no tool was detected (inside 'if !tool_executed')
2. Legitimate tool calls are detected first by process_chunk()
3. Matches existing pattern of calling mark_tool_calls_consumed()
   after tool execution
2025-12-25 17:55:13 +11:00
Dhanji R. Prasanna
f9d0c33461 Revert "Fix auto-continue bug: ensure assistant message before continue prompt"
This reverts commit fe96969adb.
2025-12-24 15:52:23 +11:00
Dhanji R. Prasanna
fe96969adb Fix auto-continue bug: ensure assistant message before continue prompt
The auto-continue logic was adding User continue prompts without first
adding an Assistant message when the LLM returned an empty response.
This caused consecutive User messages in the conversation history,
which confused the LLM and caused it to return more empty responses.

The fix ensures an Assistant message is always added before the continue
prompt, using '[empty response]' as a placeholder when the LLM returned
nothing substantive. This maintains proper User/Assistant alternation.
2025-12-24 15:50:30 +11:00
Dhanji R. Prasanna
cd64ebbf87 Add tokens consumed and context percentage to per-tool timing footer
The per-tool timing line now shows:
- Tokens delta (tokens added to context by this tool call)
- Context window usage percentage

Example: └─ ️ 1ms  523tk | 49% ctx

Changes:
- Updated UiWriter trait print_tool_timing signature
- Track tokens before/after adding tool messages to calculate delta
- Updated ConsoleUiWriter, MachineUiWriter, PlannerUiWriter, and test mocks
2025-12-24 15:44:19 +11:00
Dhanji R. Prasanna
fd22ce9890 refactor(g3-core): extract 4 modules from monolithic lib.rs
Reduce lib.rs from 7481 to 6557 lines (-12.4%) by extracting:

- paths.rs: Session/workspace path utilities (get_todo_path, get_logs_dir, etc.)
- streaming_parser.rs: StreamingToolParser for LLM response parsing
- utils.rs: Diff parsing and shell escaping utilities
- webdriver_session.rs: Unified Safari/Chrome WebDriver abstraction

All public APIs preserved via re-exports for backward compatibility.
Added 13 new unit tests across extracted modules.
All 225 tests pass.
2025-12-24 14:32:39 +11:00
Dhanji R. Prasanna
382b905441 duplicate output fix
2025-12-23 17:20:23 +11:00