Add prompt cache statistics tracking to /stats command

- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls
- Add "Prompt Cache Statistics" section to /stats output showing:
  - API call count and cache hit count
  - Hit rate percentage
  - Total input tokens and cache read/creation tokens
  - Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
Dhanji R. Prasanna
2026-01-27 11:32:45 +11:00
parent 96899230a4
commit 5b4079e861
13 changed files with 214 additions and 2 deletions


@@ -1,5 +1,5 @@
# Workspace Memory
> Updated: 2026-01-20T10:16:13Z | Size: 18.3k chars
> Updated: 2026-01-27T00:12:18Z | Size: 19.5k chars
### Remember Tool Wiring
- `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()`
@@ -324,4 +324,26 @@ Centralized logic for determining how to display tool execution results.
- `is_compact_tool()` [147..162] - checks if tool uses one-line summaries (read_file, write_file, str_replace, etc.)
- `is_self_handled_tool()` [164..167] - checks if tool handles own output (todo_read, todo_write)
- `format_compact_tool_summary()` [169..185] - dispatches to format_*_summary() based on tool name
- `parse_diff_stats()` [187..210] - parses "+N insertions | -M deletions" from str_replace result
### Prompt Cache Statistics Tracking
Tracks prompt/prefix caching efficacy across Anthropic and OpenAI providers.
- `crates/g3-providers/src/lib.rs`
- `Usage` [195..210] - added `cache_creation_tokens` and `cache_read_tokens` fields with `#[serde(default)]`
- `crates/g3-providers/src/anthropic.rs`
- `AnthropicUsage` [944..956] - parses `cache_creation_input_tokens` and `cache_read_input_tokens`
- `crates/g3-providers/src/openai.rs`
- `OpenAIUsage` [494..510] - parses `prompt_tokens_details.cached_tokens`
- `OpenAIPromptTokensDetails` [504..510] - nested struct for prompt token details
- `crates/g3-core/src/lib.rs`
- `CacheStats` [75..90] - cumulative cache statistics struct with `total_cache_creation_tokens`, `total_cache_read_tokens`, `total_input_tokens`, `cache_hit_calls`, `total_calls`
- `Agent.cache_stats` [106] - field tracking cumulative cache stats
- Cache stats are updated in `stream_completion_with_tools()` [2140..2150] when usage data is received
- `crates/g3-core/src/stats.rs`
- `AgentStatsSnapshot.cache_stats` [20] - reference to cache stats for formatting
- `format_cache_stats()` [189..230] - formats cache statistics section with hit rate and efficiency metrics
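
The accumulation and the two percentages described above can be sketched roughly as follows. This is a minimal illustration, not the code in `crates/g3-core/src/lib.rs`: the field names come from the notes above, but the `record()`, `hit_rate_percent()`, and `cache_efficiency_percent()` methods are hypothetical. It also assumes a "cache hit" means a call that reported a non-zero cache read count, and that `total_input_tokens` already includes tokens served from cache.

```rust
/// Hypothetical sketch of cumulative prompt cache statistics.
/// Field names follow the notes above; the methods are assumptions.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CacheStats {
    pub total_cache_creation_tokens: u64,
    pub total_cache_read_tokens: u64,
    pub total_input_tokens: u64,
    pub cache_hit_calls: u64,
    pub total_calls: u64,
}

impl CacheStats {
    /// Fold the usage numbers from one API call into the running totals.
    /// Assumption: a call is a cache hit when it read at least one token from cache.
    pub fn record(&mut self, input_tokens: u64, cache_creation_tokens: u64, cache_read_tokens: u64) {
        self.total_calls += 1;
        self.total_input_tokens += input_tokens;
        self.total_cache_creation_tokens += cache_creation_tokens;
        self.total_cache_read_tokens += cache_read_tokens;
        if cache_read_tokens > 0 {
            self.cache_hit_calls += 1;
        }
    }

    /// Percentage of API calls that hit the cache; 0.0 when no calls were recorded.
    pub fn hit_rate_percent(&self) -> f64 {
        if self.total_calls == 0 {
            return 0.0;
        }
        self.cache_hit_calls as f64 / self.total_calls as f64 * 100.0
    }

    /// Percentage of input tokens served from cache; 0.0 when no input was recorded.
    /// Assumption: `total_input_tokens` includes the cached tokens.
    pub fn cache_efficiency_percent(&self) -> f64 {
        if self.total_input_tokens == 0 {
            return 0.0;
        }
        self.total_cache_read_tokens as f64 / self.total_input_tokens as f64 * 100.0
    }
}
```

Guarding the zero-call and zero-token cases keeps the `/stats` output from showing `NaN` before any usage data arrives. Providers differ in whether their reported input count includes cached tokens, so the efficiency denominator may need normalizing per provider.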