Agent: fowler
Eliminate code-path aliasing and near-duplicates across recent commits:
1. Deduplicate find_json_object_end: Three near-identical copies in
streaming_parser.rs, context_window.rs, and acd.rs are consolidated into
a single canonical implementation in utils.rs; all callers now route
through it. The utils.rs version keeps the most defensive variant
(with the found_start guard); a sketch follows this list. (-84 lines)
2. Deduplicate provider constructors: AnthropicProvider::new() and
GeminiProvider::new() now delegate to their respective new_with_name()
methods instead of duplicating the full constructor body.
(OpenAI already delegated.) (-28 lines)
3. Inline convert_cache_control: Removed identity function that just
cloned CacheControl. Call sites now use .map(|cc| cc.clone())
directly. (-4 lines)
Net: -65 lines, 0 behavior changes, all 683 library tests pass.
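For reference, a minimal sketch of the consolidated helper from item 1,
assuming a `&str -> Option<usize>` signature that returns the offset just
past the closing brace; the actual utils.rs code may differ in details:

```rust
/// Returns the byte offset one past the `}` closing the first balanced
/// JSON object in `input`, or `None` if no such object is present.
pub fn find_json_object_end(input: &str) -> Option<usize> {
    let mut depth = 0usize;
    let mut found_start = false; // defensive guard: ignore `}` before any `{`
    let mut in_string = false;
    let mut escaped = false;
    for (i, ch) in input.char_indices() {
        if in_string {
            if escaped {
                escaped = false;
            } else {
                match ch {
                    '\\' => escaped = true,
                    '"' => in_string = false,
                    _ => {}
                }
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => {
                depth += 1;
                found_start = true;
            }
            '}' if found_start => {
                depth = depth.saturating_sub(1);
                if depth == 0 {
                    return Some(i + ch.len_utf8());
                }
            }
            _ => {}
        }
    }
    None
}
```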
read_image tool results placed images as top-level Image content blocks
alongside ToolResult blocks in user messages. The Anthropic API rejects
this combination, reporting orphaned tool_use IDs even though the
tool_result was present — the malformed message structure prevented
the API from recognizing it as a valid tool result.
Added ToolResultContent enum (Text | Blocks) with custom serde so that
when images are attached to a tool result, they are nested inside the
tool_result content array as structured blocks, matching the Anthropic
API's expected format for multi-modal tool results.
Regular tool results (no images) continue to use simple string content.
Regular user messages (not tool results) continue to use top-level
Image blocks.
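A minimal sketch of how such an enum can serialize both shapes; the repo
uses custom serde, and the `ContentBlock`/`ImageSource` names here are
illustrative stand-ins:

```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(untagged)]
enum ToolResultContent {
    /// Plain tool output serializes as a bare JSON string.
    Text(String),
    /// Multi-modal output serializes as a content array of typed blocks.
    Blocks(Vec<ContentBlock>),
}

#[derive(Serialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum ContentBlock {
    Text { text: String },
    Image { source: ImageSource },
}

#[derive(Serialize)]
struct ImageSource {
    #[serde(rename = "type")]
    kind: String,       // e.g. "base64"
    media_type: String, // e.g. "image/png"
    data: String,       // base64 payload
}
```

With `#[serde(untagged)]`, `Text` serializes as a bare JSON string while
`Blocks` serializes as a content array, matching the two forms the
Anthropic API accepts for tool_result content.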
4 new tests covering image nesting, string fallback, regular user
messages, and orphan detection with structured content.
After context compaction, the preserved last assistant message retained
its structured tool_calls field, but the corresponding tool_result was
summarized away. This created orphaned tool_use blocks that violated
the Anthropic API constraint: 'Each tool_use block must have a
corresponding tool_result block in the next message', causing 400 errors.
Primary fix: clear tool_calls from the preserved assistant message in
extract_preserved_messages(). The tool call was already executed and
its result is captured in the summary.
Defense-in-depth: added strip_orphaned_tool_use() post-processing in
Anthropic convert_messages() to detect and strip any orphaned tool_use
blocks before they reach the API.
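An illustrative sketch of that pass, using simplified stand-ins for the
provider's message and block types:

```rust
use std::collections::HashSet;

// Simplified stand-ins for the provider's converted message types.
enum Block {
    Text(String),
    ToolUse { id: String },
    ToolResult { tool_use_id: String },
}

struct Msg {
    content: Vec<Block>,
}

// A tool_use block is orphaned when the next message carries no
// tool_result with a matching ID; strip it before the request is built.
fn strip_orphaned_tool_use(messages: &mut [Msg]) {
    for i in 0..messages.len() {
        // Collect the tool_result IDs answered in the *next* message.
        let answered: HashSet<String> = match messages.get(i + 1) {
            Some(next) => next
                .content
                .iter()
                .filter_map(|b| match b {
                    Block::ToolResult { tool_use_id } => Some(tool_use_id.clone()),
                    _ => None,
                })
                .collect(),
            None => HashSet::new(),
        };
        // Drop any tool_use block with no corresponding tool_result.
        messages[i].content.retain(|b| match b {
            Block::ToolUse { id } => answered.contains(id),
            _ => true,
        });
    }
}
```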
Added 7 tests: 3 unit tests for compaction stripping, 3 unit tests for
Anthropic orphan detection, 1 integration test reproducing the exact
bug scenario from the h3 session.
The agent would stop mid-task because native tool calls were stored as
inline JSON text in Message.content. When sent back to the Anthropic API
via convert_messages(), they went as plain text instead of structured
tool_use/tool_result blocks. The model would occasionally get confused
and emit text describing what it wanted to do instead of invoking the
tool mechanism.
Changes:
- Add MessageToolCall struct and tool_calls/tool_result_id fields to Message
- Add id field to core ToolCall struct to preserve provider tool call IDs
- Update Anthropic convert_messages() to emit tool_use and tool_result blocks (sketched after this list)
- Add ToolResult variant to AnthropicContent enum
- Store tool calls structurally in tool message construction (not inline JSON)
- Fix add_message() to preserve empty-content messages with tool_calls
- Fix check_duplicate_in_previous_message() to check structured tool_calls
- Generate valid IDs for JSON fallback tool calls (Anthropic pattern requirement)
- Update planner create_tool_message() to use structured tool calls
- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls
- Add "Prompt Cache Statistics" section to /stats output showing:
- API call count and cache hit count
- Hit rate percentage
- Total input tokens and cache read/creation tokens
- Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
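Two sketches of the changes above. First, the direction of the
convert_messages() fix: tool calls become structured tool_use blocks
paired with tool_result blocks instead of inline JSON text. The
functions below are simplified stand-ins, not the real signatures:

```rust
use serde_json::{json, Value};

// Stand-in for the new struct; the id preserves the provider's tool call ID.
struct MessageToolCall {
    id: String,
    name: String,
    arguments: Value,
}

// Build assistant content as structured blocks the API can pair with
// tool_result blocks in the next user message.
fn assistant_content(text: &str, tool_calls: &[MessageToolCall]) -> Vec<Value> {
    let mut blocks = Vec::new();
    if !text.is_empty() {
        blocks.push(json!({ "type": "text", "text": text }));
    }
    for call in tool_calls {
        blocks.push(json!({
            "type": "tool_use",
            "id": call.id.clone(),
            "name": call.name.clone(),
            "input": call.arguments.clone(),
        }));
    }
    blocks
}

// The matching tool_result block, keyed by the preserved tool_use ID.
fn tool_result_block(tool_use_id: &str, content: &str) -> Value {
    json!({
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
    })
}
```

Second, a plausible shape for the cumulative accumulator behind the
"Prompt Cache Statistics" section; only the tracked quantities come from
the list above, the field and method names are assumptions:

```rust
#[derive(Default)]
struct CacheStats {
    api_calls: u64,
    cache_hits: u64,
    input_tokens: u64,
    cache_read_tokens: u64,
    cache_creation_tokens: u64,
}

impl CacheStats {
    /// Hit rate percentage across API calls.
    fn hit_rate(&self) -> f64 {
        if self.api_calls == 0 {
            return 0.0;
        }
        100.0 * self.cache_hits as f64 / self.api_calls as f64
    }

    /// Cache efficiency: % of input tokens served from the cache.
    fn efficiency(&self) -> f64 {
        if self.input_tokens == 0 {
            return 0.0;
        }
        100.0 * self.cache_read_tokens as f64 / self.input_tokens as f64
    }
}
```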
- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback (sketched below)
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()
Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines inside strings, or prose following the JSON, invalidate detection
The user sees ' ● tool_name |' immediately when a tool call starts
streaming, with a blinking indicator while arguments are received.
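A sketch of the hint states and the atomic UI flag. ToolParsingHint and
ParsingHintState are named in the changes above; the u8 encoding is an
assumption for illustration:

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[derive(Clone, Copy)]
enum ToolParsingHint {
    Detected, // tool-call pattern spotted in the stream
    Active,   // tool name known, arguments still streaming
    Complete, // full call parsed; hint can be cleared
}

// An atomic lets the shared UiWriter flip hint state from the streaming
// path without taking a lock.
struct ParsingHintState {
    state: AtomicU8, // 0 = idle, 1 = detected, 2 = active, 3 = complete
}

impl ParsingHintState {
    fn set(&self, hint: ToolParsingHint) {
        let v = match hint {
            ToolParsingHint::Detected => 1,
            ToolParsingHint::Active => 2,
            ToolParsingHint::Complete => 3,
        };
        self.state.store(v, Ordering::Relaxed);
    }
}
```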
The buffer truncation code was slicing at a raw byte offset which could
land in the middle of a multi-byte character (like emojis), causing a
panic. Fixed by using char_indices() to find valid character boundaries.
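A sketch of the boundary-safe truncation, with an illustrative helper name:

```rust
// Slicing `&s[..limit]` panics if `limit` falls inside a multi-byte
// character (e.g. an emoji), so walk char_indices() back to the last
// valid character boundary at or before the limit.
fn truncate_at_char_boundary(s: &str, limit: usize) -> &str {
    if s.len() <= limit {
        return s;
    }
    let end = s
        .char_indices()
        .map(|(i, _)| i)
        .take_while(|&i| i <= limit)
        .last()
        .unwrap_or(0);
    &s[..end]
}
```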
Also added stop_reason field to CompletionChunk initializers in tests
to complete the stop_reason feature addition.
- Fix byte boundary panic in filter_json.rs line 327
- Add test for multi-byte character handling
- Update test files with missing stop_reason field
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
Converted ~77 info! macro calls to debug! across the codebase to prevent
log messages from interrupting the CLI experience during normal operation.
Users can still see these logs by setting RUST_LOG=debug if needed.
Affected crates:
- g3-cli
- g3-computer-control
- g3-console
- g3-core
- g3-ensembles
- g3-execution
- g3-providers
Writes the current context window to logs/current_context_window (a symlink pointing to the session-ID-named log file).
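A sketch of the symlink update, assuming Unix and an illustrative
per-session file naming scheme (the real paths may differ):

```rust
use std::fs;
use std::os::unix::fs::symlink;

fn update_context_window_link(session_id: &str) -> std::io::Result<()> {
    let target = format!("logs/context_window_{session_id}");
    let link = "logs/current_context_window";
    // Remove any stale link first; symlink() fails if the path exists.
    let _ = fs::remove_file(link);
    symlink(target, link)
}
```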
This PR was unfortunately generated by a different LLM and did a ton of superficial reformatting. The actual change is fairly small and benign, but I don't want to roll back everything. Hope that's ok.
This tries to short-circuit multiple round-trips to the LLM when reading code.
It's a precursor to context engineering tailored to specific tasks.
In initial experiments it's only marginally faster than regular mode and burns more tokens.