fix: store tool calls structurally for proper API roundtripping

The agent would stop mid-task because native tool calls were stored as
inline JSON text in Message.content. When sent back to the Anthropic API
via convert_messages(), they went as plain text instead of structured
tool_use/tool_result blocks. The model would occasionally get confused
and emit text describing what it wanted to do instead of invoking the
tool mechanism.

Changes:
- Add MessageToolCall struct and tool_calls/tool_result_id fields to Message
- Add id field to core ToolCall struct to preserve provider tool call IDs
- Update Anthropic convert_messages() to emit tool_use and tool_result blocks
- Add ToolResult variant to AnthropicContent enum
- Store tool calls structurally in tool message construction (not inline JSON)
- Fix add_message() to preserve empty-content messages with tool_calls
- Fix check_duplicate_in_previous_message() to check structured tool_calls
- Generate valid IDs for JSON fallback tool calls (Anthropic pattern requirement)
- Update planner create_tool_message() to use structured tool calls
This commit is contained in:
Dhanji R. Prasanna
2026-02-11 08:48:07 +11:00
parent 2a4cd1f4d6
commit d3f0112f46
15 changed files with 355 additions and 53 deletions

View File

@@ -1,5 +1,5 @@
# Workspace Memory
> Updated: 2026-02-07T05:28:12Z | Size: 26.3k chars
> Updated: 2026-02-10T21:08:38Z | Size: 27.8k chars
### Remember Tool Wiring
- `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()`
@@ -429,4 +429,17 @@ Makes tool output responsive to terminal width - no line wrapping, with 4-char r
- **Bug**: The `_ =>` catch-all in when condition evaluation did naive string `contains` check. For `Matches` (regex like `^Re: `), it checked if fact values literally contained the regex pattern string — which never matched. Result: when conditions with `matches` rule always evaluated as not-met → vacuous pass → violations slipped through.
- **Fix**: Replaced hand-rolled when evaluation with synthetic `CompiledPredicate` delegation to `evaluate_predicate_datalog()`, which handles all 12 rule types correctly.
- **Tests**: `test_execute_rules_when_matches_condition_met`, `test_execute_rules_when_matches_condition_met_but_predicate_fails`, `test_execute_rules_when_matches_condition_not_met`
- **Note**: The `invariants.rs` path was NOT affected — it already delegated to `evaluate_predicate()` which handles all rules.
- **Note**: The `invariants.rs` path was NOT affected — it already delegated to `evaluate_predicate()` which handles all rules.
### Structured Tool Call Messages (2026-02-11)
- `crates/g3-providers/src/lib.rs` [102..106] - `MessageToolCall` struct (id, name, input)
- `crates/g3-providers/src/lib.rs` [124..131] - `Message.tool_calls: Vec<MessageToolCall>`, `Message.tool_result_id: Option<String>`
- `crates/g3-providers/src/anthropic.rs` [284..340] - `convert_messages()` emits `tool_use` blocks for assistant messages with `tool_calls`, `tool_result` blocks for user messages with `tool_result_id`
- `crates/g3-providers/src/anthropic.rs` [935..941] - `AnthropicContent::ToolResult` variant added
- `crates/g3-core/src/lib.rs` [82..88] - `ToolCall.id` field added (from native providers)
- `crates/g3-core/src/lib.rs` [2530..2545] - Tool messages store structured `tool_calls` instead of inline JSON text
- `crates/g3-core/src/lib.rs` [1385..1400] - `check_duplicate_in_previous_message()` checks structured `tool_calls` field
- `crates/g3-core/src/context_window.rs` [107..109] - `add_message_with_tokens()` preserves messages with `tool_calls` even if content is empty
- `crates/g3-core/src/streaming_parser.rs` [339] - `process_chunk()` preserves tool call `id` from provider
**Bug fixed**: Agent would stop mid-task because native tool calls were stored as inline JSON text in `Message.content`. When sent back to Anthropic API via `convert_messages()`, they went as plain text instead of structured `tool_use`/`tool_result` blocks. The model would occasionally get confused and emit text describing what it wanted to do instead of invoking the tool mechanism.