36 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
1ad74baaa5 Readability refactor: extract mega-functions into focused helpers
Agent: carmack

4 files refactored, net -250 lines, all tests passing (417 + 71).

datalog.rs:
- Extract 7 predicate evaluation helpers from evaluate_predicate_datalog()
  (~200-line match → 12-line dispatch table)
- Extract rule_body_for_predicate() from format_datalog_program()
  (~75-line match → 2-line call)

invariants.rs:
- Extract 7 per-rule helpers from evaluate_predicate()
  (~230-line match → 12-line dispatch table)

envelope.rs:
- Simplify summary construction in verify_envelope()
- Eliminate redundant clone in stamp_envelope()

anthropic.rs:
- Introduce StreamState struct with 6 handler methods
- parse_streaming_response: ~290 lines → ~90 lines
- Max nesting depth reduced from 8 to 4 levels
2026-02-13 16:21:38 +11:00
Dhanji R. Prasanna
a7e0b0ef9e Refactor: deduplicate JSON parsing, provider constructors, and identity function
Agent: fowler

Eliminate code-path aliasing and near-duplicates across recent commits:

1. Deduplicate find_json_object_end: Three near-identical copies in
   streaming_parser.rs, context_window.rs, and acd.rs consolidated into
   a single canonical implementation in utils.rs. All callers now route
   through the canonical version. The utils.rs version uses the most
   defensive variant (with found_start guard). (-84 lines)

2. Deduplicate provider constructors: AnthropicProvider::new() and
   GeminiProvider::new() now delegate to their respective new_with_name()
   methods instead of duplicating the full constructor body.
   (OpenAI already delegated.) (-28 lines)

3. Inline convert_cache_control: Removed identity function that just
   cloned CacheControl. Call sites now use .map(|cc| cc.clone())
   directly. (-4 lines)

Net: -65 lines, 0 behavior changes, all 683 library tests pass.
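
For illustration, a minimal sketch of a brace scanner with a found_start guard. The signature and the string/escape handling here are assumptions, not the actual code in utils.rs:

```rust
/// Return the byte index of the `}` that closes the first top-level JSON
/// object in `text`, or `None` if no complete object is present.
/// Hypothetical signature; the real utils.rs function may differ.
pub fn find_json_object_end(text: &str) -> Option<usize> {
    let mut depth: usize = 0;
    let mut in_string = false;
    let mut escaped = false;
    // Guard: nothing counts until the first `{` has been seen, so stray
    // quotes or closing braces in leading prose cannot confuse the scan.
    let mut found_start = false;

    for (i, ch) in text.char_indices() {
        if escaped {
            // The previous character was a backslash inside a string.
            escaped = false;
            continue;
        }
        if in_string {
            match ch {
                '\\' => escaped = true,
                '"' => in_string = false,
                _ => {}
            }
            continue;
        }
        match ch {
            '"' if found_start => in_string = true,
            '{' => {
                found_start = true;
                depth += 1;
            }
            '}' if found_start => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None
}
```

Braces inside string values are skipped, and input that ends before the object closes yields None, so a streaming caller can wait for more data.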
2026-02-13 12:37:09 +11:00
Dhanji R. Prasanna
fcb839e5fd fix: nest images inside tool_result content for Anthropic API compliance
read_image tool results placed images as top-level Image content blocks
alongside ToolResult blocks in user messages. The Anthropic API rejects
this combination, reporting orphaned tool_use IDs even though the
tool_result was present — the malformed message structure prevented
the API from recognizing it as a valid tool result.

Added ToolResultContent enum (Text | Blocks) with custom serde so that
when images are attached to a tool result, they are nested inside the
tool_result content array as structured blocks, matching the Anthropic
API's expected format for multi-modal tool results.

Regular tool results (no images) continue to use simple string content.
Regular user messages (not tool results) continue to use top-level
Image blocks.

4 new tests covering image nesting, string fallback, regular user
messages, and orphan detection with structured content.
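
A rough sketch of the two content shapes described above. The Block type, the to_json method, and the hand-rolled JSON rendering are assumptions that stand in for the custom serde implementation so the snippet needs no dependencies:

```rust
/// Hypothetical block type for structured tool_result content.
pub enum Block {
    Text(String),
    Image { media_type: String, data_base64: String },
}

/// Plain string for ordinary results, block array when images are attached.
pub enum ToolResultContent {
    Text(String),
    Blocks(Vec<Block>),
}

/// Minimal JSON string escaping, enough for this illustration.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl ToolResultContent {
    /// Render the value that would sit under the tool_result "content" key.
    pub fn to_json(&self) -> String {
        match self {
            // No images: a bare JSON string.
            ToolResultContent::Text(s) => json_string(s),
            // Images attached: an array of typed blocks.
            ToolResultContent::Blocks(blocks) => {
                let parts: Vec<String> = blocks
                    .iter()
                    .map(|b| match b {
                        Block::Text(t) => {
                            format!(r#"{{"type":"text","text":{}}}"#, json_string(t))
                        }
                        Block::Image { media_type, data_base64 } => format!(
                            r#"{{"type":"image","source":{{"type":"base64","media_type":{},"data":{}}}}}"#,
                            json_string(media_type),
                            json_string(data_base64)
                        ),
                    })
                    .collect();
                format!("[{}]", parts.join(","))
            }
        }
    }
}
```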
2026-02-13 10:50:52 +11:00
Dhanji R. Prasanna
d61be719c2 fix: strip orphaned tool_calls from preserved assistant message during compaction
After context compaction, the preserved last assistant message retained
its structured tool_calls field, but the corresponding tool_result was
summarized away. This created orphaned tool_use blocks that violated
the Anthropic API constraint: 'Each tool_use block must have a
corresponding tool_result block in the next message', causing 400 errors.

Primary fix: clear tool_calls from the preserved assistant message in
extract_preserved_messages(). The tool call was already executed and
its result is captured in the summary.

Defense-in-depth: added strip_orphaned_tool_use() post-processing in
Anthropic convert_messages() to detect and strip any orphaned tool_use
blocks before they reach the API.

Added 7 tests: 3 unit tests for compaction stripping, 3 unit tests for
Anthropic orphan detection, 1 integration test reproducing the exact
bug scenario from the h3 session.
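
The defense-in-depth pass can be sketched roughly as follows. The Msg and Content types are simplified stand-ins (assumptions), and the real strip_orphaned_tool_use() may differ in signature and detail:

```rust
use std::collections::HashSet;

/// Simplified stand-in for the provider's content blocks.
#[derive(Debug, Clone, PartialEq)]
pub enum Content {
    Text(String),
    ToolUse { id: String },
    ToolResult { tool_use_id: String },
}

/// Simplified stand-in for a converted message.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub role: &'static str,
    pub content: Vec<Content>,
}

/// Drop every tool_use block whose ID has no matching tool_result
/// in the message that immediately follows it.
pub fn strip_orphaned_tool_use(messages: &mut [Msg]) {
    for i in 0..messages.len() {
        // IDs answered by the next message (empty set if there is none).
        let answered: HashSet<String> = messages
            .get(i + 1)
            .map(|next| {
                next.content
                    .iter()
                    .filter_map(|c| match c {
                        Content::ToolResult { tool_use_id } => Some(tool_use_id.clone()),
                        _ => None,
                    })
                    .collect()
            })
            .unwrap_or_default();

        // Keep non-tool_use blocks; keep tool_use only when it is answered.
        messages[i].content.retain(|c| match c {
            Content::ToolUse { id } => answered.contains(id),
            _ => true,
        });
    }
}
```

This sketch does not handle a message left with no content after stripping; a caller would have to decide whether to drop it or substitute placeholder text.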
2026-02-11 15:22:03 +11:00
Dhanji R. Prasanna
d3f0112f46 fix: store tool calls structurally for proper API roundtripping
The agent would stop mid-task because native tool calls were stored as
inline JSON text in Message.content. When sent back to the Anthropic API
via convert_messages(), they went as plain text instead of structured
tool_use/tool_result blocks. The model would occasionally get confused
and emit text describing what it wanted to do instead of invoking the
tool mechanism.

Changes:
- Add MessageToolCall struct and tool_calls/tool_result_id fields to Message
- Add id field to core ToolCall struct to preserve provider tool call IDs
- Update Anthropic convert_messages() to emit tool_use and tool_result blocks
- Add ToolResult variant to AnthropicContent enum
- Store tool calls structurally in tool message construction (not inline JSON)
- Fix add_message() to preserve empty-content messages with tool_calls
- Fix check_duplicate_in_previous_message() to check structured tool_calls
- Generate valid IDs for JSON fallback tool calls (Anthropic pattern requirement)
- Update planner create_tool_message() to use structured tool calls
2026-02-11 08:48:07 +11:00
Dhanji R. Prasanna
5b4079e861 Add prompt cache statistics tracking to /stats command
- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls
- Add "Prompt Cache Statistics" section to /stats output showing:
  - API call count and cache hit count
  - Hit rate percentage
  - Total input tokens and cache read/creation tokens
  - Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
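
A small sketch of how the cumulative figures could be derived. Field and method names are hypothetical, and it assumes the input token count passed to record() already includes tokens served from cache:

```rust
/// Hypothetical cumulative counters; the real CacheStats fields may differ.
#[derive(Debug, Default)]
pub struct CacheStats {
    pub api_calls: u64,
    pub cache_hits: u64,
    pub total_input_tokens: u64,
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
}

impl CacheStats {
    /// Fold one API response's usage into the running totals.
    /// Assumption: `input_tokens` includes the tokens read from cache.
    pub fn record(&mut self, input_tokens: u64, cache_creation_tokens: u64, cache_read_tokens: u64) {
        self.api_calls += 1;
        if cache_read_tokens > 0 {
            // A call counts as a hit when any part of the prompt came from cache.
            self.cache_hits += 1;
        }
        self.total_input_tokens += input_tokens;
        self.cache_creation_tokens += cache_creation_tokens;
        self.cache_read_tokens += cache_read_tokens;
    }

    /// Percentage of API calls that read at least one token from cache.
    pub fn hit_rate_percent(&self) -> f64 {
        percent(self.cache_hits, self.api_calls)
    }

    /// Percentage of all input tokens that were served from cache.
    pub fn efficiency_percent(&self) -> f64 {
        percent(self.cache_read_tokens, self.total_input_tokens)
    }
}

/// Percentage helper that returns 0 instead of dividing by zero.
fn percent(numerator: u64, denominator: u64) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        numerator as f64 * 100.0 / denominator as f64
    }
}
```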
2026-01-27 11:32:45 +11:00
Dhanji R. Prasanna
0ae1a13cdb feat: real-time tool call streaming indicator with blinking UI
- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()

Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines in strings or prose after JSON invalidate detection

User sees ' ● tool_name |' immediately when tool call starts streaming,
with blinking indicator while args are received.
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
f4562cd4c9 config: default agent settings and provider override 2026-01-14 20:14:33 +05:30
Dhanji R. Prasanna
e301075666 Fix panic on multi-byte chars in filter_json buffer truncation
The buffer truncation code was slicing at a raw byte offset which could
land in the middle of a multi-byte character (like emojis), causing a
panic. Fixed by using char_indices() to find valid character boundaries.

Also added stop_reason field to CompletionChunk initializers in tests
to complete the stop_reason feature addition.

- Fix byte boundary panic in filter_json.rs line 327
- Add test for multi-byte character handling
- Update test files with missing stop_reason field
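
The boundary fix can be illustrated with a small helper. The function name is hypothetical, and it truncates from the front of the buffer, which may not match the direction used in filter_json.rs:

```rust
/// Return the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a character boundary. Slicing with `&s[..max_bytes]` directly
/// panics when the offset falls inside a multi-byte character.
pub fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let end = s
        .char_indices()
        // Byte offset just past each character.
        .map(|(i, ch)| i + ch.len_utf8())
        // Keep only the ends that still fit in the budget.
        .take_while(|&end| end <= max_bytes)
        .last()
        .unwrap_or(0);
    &s[..end]
}
```

With the input "ab😀cd" and a 3-byte budget, the emoji occupies bytes 2 to 5, so the helper returns "ab" where a raw slice at byte 3 would panic.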
2026-01-09 15:20:57 +11:00
Dhanji R. Prasanna
2bf475960c refactor: extract shared streaming utilities module
Agent: carmack

Create crates/g3-providers/src/streaming.rs with shared helpers:
- decode_utf8_streaming(): Handle incomplete UTF-8 sequences in SSE streams
- is_incomplete_json_error(): Detect incomplete vs malformed JSON
- make_final_chunk(): Create finished completion chunks
- make_text_chunk(): Create text content chunks
- make_tool_chunk(): Create tool call chunks

Refactor anthropic.rs:
- Use shared decode_utf8_streaming (removes 15 lines of inline UTF-8 handling)
- Use make_final_chunk, make_text_chunk, make_tool_chunk helpers
- Reduces verbose CompletionChunk constructions throughout

Refactor databricks.rs:
- Remove local copies of streaming helpers (now uses shared module)
- Reduces duplication between providers

Net reduction: 118 lines removed, 16 lines added (including new module)
All tests pass. Behavior unchanged.
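
As a sketch of the incomplete-sequence handling, assuming a caller-owned pending buffer (the real decode_utf8_streaming() signature in streaming.rs is not shown in this log and may differ):

```rust
/// Append `chunk` to `pending`, return all text that can be decoded so far,
/// and leave any incomplete trailing UTF-8 sequence in `pending` for the
/// next call. Invalid bytes are replaced with U+FFFD.
pub fn decode_utf8_streaming(pending: &mut Vec<u8>, chunk: &[u8]) -> String {
    pending.extend_from_slice(chunk);
    let mut out = String::new();
    let mut consumed = 0;

    loop {
        let rest = &pending[consumed..];
        match std::str::from_utf8(rest) {
            Ok(valid) => {
                out.push_str(valid);
                consumed = pending.len();
                break;
            }
            Err(err) => {
                // Everything before the error is valid UTF-8.
                let good = err.valid_up_to();
                out.push_str(&String::from_utf8_lossy(&rest[..good]));
                consumed += good;
                match err.error_len() {
                    // Input ended mid-character: wait for the next chunk.
                    None => break,
                    // Genuinely invalid bytes: substitute and keep going.
                    Some(bad) => {
                        out.push('\u{FFFD}');
                        consumed += bad;
                    }
                }
            }
        }
    }

    pending.drain(..consumed);
    out
}
```

An SSE chunk that ends halfway through a character yields only the complete text before it; the remaining bytes are emitted once the next chunk completes the sequence.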
2026-01-07 12:48:07 +11:00
Dhanji R. Prasanna
3601cc0547 Enhance read_image tool with magic byte detection and multi-image support
- Fix media type detection using magic bytes instead of file extension
  - Correctly identifies JPEG files with .png extension (and vice versa)
  - Supports PNG, JPEG, GIF, and WebP formats

- Add multi-image support with file_paths array parameter
  - Load multiple images in a single tool call
  - All images queued for LLM analysis

- Enhanced CLI output:
  - Inline image preview via iTerm2 imgcat protocol (height=5)
  - Dimmed info line showing: path | dimensions | media type | file size
  - Proper │ prefix alignment with tool output boxing
  - Human-readable file sizes (bytes, KB, MB)

- Add image dimension extraction from file headers
  - PNG, JPEG, GIF, WebP dimension parsing

- Add comprehensive tests for magic byte detection and dimensions
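
A minimal sketch of signature-based detection and PNG dimension parsing, using the published file signatures for the four formats. Function names are hypothetical, and the JPEG, GIF, and WebP dimension parsers are omitted:

```rust
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Guess the media type from the leading bytes, ignoring the file extension.
pub fn detect_media_type(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(&PNG_SIGNATURE) {
        Some("image/png")
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("image/jpeg")
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("image/gif")
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP is a RIFF container: "RIFF", 4-byte size, then "WEBP".
        Some("image/webp")
    } else {
        None
    }
}

/// Read (width, height) from a PNG header. After the 8-byte signature come
/// the IHDR chunk length (4 bytes), the tag "IHDR" (4 bytes), then width
/// and height as big-endian u32 values.
pub fn png_dimensions(bytes: &[u8]) -> Option<(u32, u32)> {
    if bytes.len() < 24 || !bytes.starts_with(&PNG_SIGNATURE) || &bytes[12..16] != b"IHDR" {
        return None;
    }
    let width = u32::from_be_bytes([bytes[16], bytes[17], bytes[18], bytes[19]]);
    let height = u32::from_be_bytes([bytes[20], bytes[21], bytes[22], bytes[23]]);
    Some((width, height))
}
```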
2025-12-26 11:19:37 +11:00
Dhanji R. Prasanna
3ece02ff31 fix: resolve compiler warnings across crates
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
2025-12-25 18:47:22 +11:00
Dhanji R. Prasanna
923def0ab2 Convert all INFO logs to DEBUG to reduce CLI noise
Converted ~77 info! macro calls to debug! across the codebase to prevent
log messages from interrupting the CLI experience during normal operation.
Users can still see these logs by setting RUST_LOG=debug if needed.

Affected crates:
- g3-cli
- g3-computer-control
- g3-console
- g3-core
- g3-ensembles
- g3-execution
- g3-providers
2025-12-22 16:27:35 +11:00
Dhanji R. Prasanna
b4f6da6bf2 duplicate tool call bugfix 2025-12-19 15:24:03 +11:00
Jochen
ff8b3e7c7b Implement planning mode 2025-12-09 17:03:53 +11:00
Jochen
4aa84e2144 disable thinking if there is no token budget 2025-12-09 16:45:28 +11:00
Jochen
fb2cf6f898 fix for thinking budget and hardcoded max token on summary 2025-12-09 12:41:52 +11:00
Jochen
ae16243f49 Fix temperature param + add thinking for anthropic
The temperature param was not passed to the LLM.
Now supports Anthropic models in 'thinking' mode.
2025-12-02 17:24:55 +11:00
Dhanji R. Prasanna
8928fb92be append instead of replace system msg 2025-11-29 16:13:00 +11:00
Jochen
52f78653b4 add context window monitor
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).

This PR was unfortunately generated by a different LLM, which did a ton of superficial reformatting. It's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
2025-11-27 21:00:02 +11:00
Jochen
ad198a8501 add code exploration fast start
This tries to short-circuit multiple round-trips to the LLM for reading code.
It's a precursor to context engineering tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode, and burns more tokens.
2025-11-25 22:51:32 +11:00
Jochen
a150ba6a55 adds ttl to cache control 2025-11-18 23:23:49 +11:00
Jochen
296bf5a449 adds cache_control 2025-11-18 22:38:52 +11:00
Dhanji R. Prasanna
cef234d91a more color 2025-11-06 13:51:58 +11:00
Dhanji R. Prasanna
d78732df14 colors 2025-11-06 13:41:06 +11:00
Dhanji R. Prasanna
d007e8f471 improve code_search nudge and increase anthropic timeout 2025-11-05 15:05:29 +11:00
Dhanji Prasanna
f89bbfc89a fix final_output bug 2025-10-31 14:48:36 +11:00
Jochen
010a43d203 coach/player provider split + add OpenAI
Allows coach and player LLM providers to be separately specified.
Also adds OpenAI provider
2025-10-21 16:59:13 +11:00
Dhanji Prasanna
bb90cc7826 some fixes 2025-10-14 12:44:02 +11:00
Dhanji Prasanna
260c949576 token counting fixes 2025-10-09 12:11:21 +11:00
Dhanji Prasanna
046b54c49b move embedded provider to a better crate 2025-10-01 15:19:37 +10:00
Dhanji Prasanna
5ef4a74468 minor 2025-09-26 21:38:01 +10:00
Dhanji Prasanna
dd20e0bb01 some cleanup of conversation mgmt 2025-09-22 20:38:44 +10:00
Dhanji Prasanna
64d2ac53a8 tweaks for streaming toolcalls 2025-09-22 14:25:04 +10:00
Dhanji Prasanna
9a5486f2a8 Fix for tool use 2025-09-20 20:17:50 +10:00
Dhanji Prasanna
444245d7dd Readd the anthropic provider 2025-09-20 18:40:51 +10:00