Add prompt cache statistics tracking to /stats command
- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls
- Add "Prompt Cache Statistics" section to /stats output showing:
  - API call count and cache hit count
  - Hit rate percentage
  - Total input tokens and cache read/creation tokens
  - Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
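The cumulative tracking described above could be sketched as follows. This is a hypothetical illustration: the actual CacheStats field and method names on Agent are not shown in the commit, so everything here is an assumption beyond the quantities listed in the message (call count, hit count, hit rate, token totals, efficiency).

```rust
// Hypothetical sketch of cumulative cache tracking; field and method
// names are assumptions, not taken from the commit's actual code.
#[derive(Default, Debug)]
pub struct CacheStats {
    pub api_calls: u64,
    pub cache_hits: u64,
    pub input_tokens: u64,
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
}

impl CacheStats {
    /// Fold one API response's usage numbers into the running totals.
    pub fn record(&mut self, prompt_tokens: u32, cache_read: u32, cache_creation: u32) {
        self.api_calls += 1;
        if cache_read > 0 {
            self.cache_hits += 1;
        }
        self.input_tokens += prompt_tokens as u64;
        self.cache_read_tokens += cache_read as u64;
        self.cache_creation_tokens += cache_creation as u64;
    }

    /// Fraction of API calls that read anything from the cache.
    pub fn hit_rate(&self) -> f64 {
        if self.api_calls == 0 {
            0.0
        } else {
            self.cache_hits as f64 / self.api_calls as f64
        }
    }

    /// Share of input tokens served from the cache ("cache efficiency").
    pub fn efficiency(&self) -> f64 {
        if self.input_tokens == 0 {
            0.0
        } else {
            self.cache_read_tokens as f64 / self.input_tokens as f64
        }
    }
}

fn main() {
    let mut stats = CacheStats::default();
    stats.record(1200, 1000, 0); // hit: 1000 of 1200 input tokens read from cache
    stats.record(1200, 0, 1000); // miss: wrote 1000 tokens to cache instead
    println!("hit rate: {:.1}%", 100.0 * stats.hit_rate());
    println!("efficiency: {:.1}%", 100.0 * stats.efficiency());
}
```

Guarding the divisions keeps /stats from panicking before the first API call has been made.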
@@ -196,6 +196,12 @@ pub struct Usage {
     pub prompt_tokens: u32,
     pub completion_tokens: u32,
     pub total_tokens: u32,
+    /// Tokens written to cache (Anthropic: cache_creation_input_tokens)
+    #[serde(default)]
+    pub cache_creation_tokens: u32,
+    /// Tokens read from cache (Anthropic: cache_read_input_tokens, OpenAI: cached_tokens)
+    #[serde(default)]
+    pub cache_read_tokens: u32,
 }
 
 pub type CompletionStream = tokio_stream::wrappers::ReceiverStream<Result<CompletionChunk>>;