Calibrate used_tokens from API prompt_tokens (ground truth) to fix
progress bar drift in interactive mode. Three issues fixed:
1. update_usage_from_response() only updated cumulative_tokens and never
   calibrated used_tokens. It now snaps used_tokens to prompt_tokens when
   available and falls back to the heuristic when prompt_tokens is 0
   (see the sketch after this list).
2. Moved the calibration call inline during streaming (when the usage chunk
   arrives) instead of after the loop. Text-only responses, the most common
   case in interactive mode, take an early return path that bypassed the
   post-loop usage update entirely.
3. Removed the mock Usage with a hardcoded prompt_tokens=100 from
   execute_single_task(), which was corrupting calibration.
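A minimal sketch of the calibration path in Rust. The field names (used_tokens, cumulative_tokens, prompt_tokens) follow the description above, but the struct shape and method signature are assumptions, not the actual implementation:

```rust
// Sketch only: the real ContextWindow has more fields, and the update is
// driven from the streaming loop (see item 2 above).
struct ContextWindow {
    used_tokens: u64,       // drives the progress bar
    cumulative_tokens: u64, // lifetime total across API calls
}

impl ContextWindow {
    /// Calibrate local bookkeeping against provider-reported usage.
    fn update_usage_from_response(&mut self, prompt_tokens: u64, completion_tokens: u64) {
        self.cumulative_tokens += prompt_tokens + completion_tokens;
        if prompt_tokens > 0 {
            // Ground truth from the API: snap instead of accumulating drift.
            self.used_tokens = prompt_tokens;
        }
        // prompt_tokens == 0 means the provider sent no usage data, so the
        // existing heuristic estimate is left in place.
    }
}
```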
Our token estimation heuristic (chars/3 * 1.1 for code, chars/4 * 1.1 for text)
slightly undercounts over long sessions with hundreds of tool calls. This
accumulated drift of ~89 tokens caused Anthropic API 400 errors:
'prompt is too long: 200089 tokens > 200000 maximum'
Fix: ContextWindow::new() now applies a 1% buffer, setting total_tokens to 99%
of the provider-reported limit. For a 200k window this gives 198k, providing a
2000-token safety margin that absorbs estimation drift.
All percentage calculations, compaction thresholds, and thinning triggers
operate against the buffered limit, so compaction fires earlier and we never
send a request the API will reject.
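A sketch of how the heuristic and the buffered constructor fit together. The constants (the 1.1 multiplier, divisors of 3 and 4, and the 1% buffer) come from the description above; the function signatures and the usage_ratio() helper are assumptions:

```rust
struct ContextWindow { used_tokens: u64, cumulative_tokens: u64, total_tokens: u64 }

/// Rough estimate used when the API returns no usage data:
/// chars/3 * 1.1 for code, chars/4 * 1.1 for prose.
fn estimate_tokens(text: &str, is_code: bool) -> u64 {
    let chars = text.chars().count() as f64;
    let chars_per_token = if is_code { 3.0 } else { 4.0 };
    (chars / chars_per_token * 1.1).ceil() as u64
}

impl ContextWindow {
    fn new(provider_limit: u64) -> Self {
        // Keep 1% in reserve: 200_000 -> 198_000, a 2_000-token margin
        // that absorbs the heuristic's undercounting.
        let total_tokens = provider_limit * 99 / 100;
        Self { used_tokens: 0, cumulative_tokens: 0, total_tokens }
    }

    /// Compaction, thinning, and percentage display all use this ratio,
    /// so they react before the real provider limit is reached.
    fn usage_ratio(&self) -> f64 {
        self.used_tokens as f64 / self.total_tokens as f64
    }
}
```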
- Extend Usage struct with cache_creation_tokens and cache_read_tokens fields
- Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching
- Add CacheStats struct to Agent for cumulative tracking across API calls (sketched after this list)
- Add "Prompt Cache Statistics" section to /stats output showing:
  - API call count and cache hit count
  - Hit rate percentage
  - Total input tokens and cache read/creation tokens
  - Cache efficiency (% of input served from cache)
- Update all provider implementations and test files
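A sketch of the cumulative tracking and the derived percentages shown in /stats. The Usage fields match the list above; the CacheStats fields, record() helper, and the definition of a "hit" (any tokens read from cache) are assumptions:

```rust
#[derive(Default)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
    // Populated from Anthropic's cache_creation_input_tokens /
    // cache_read_input_tokens and OpenAI's prompt_tokens_details.cached_tokens.
    cache_creation_tokens: u64,
    cache_read_tokens: u64,
}

#[derive(Default)]
struct CacheStats {
    api_calls: u64,
    cache_hits: u64,
    input_tokens: u64,
    cache_read_tokens: u64,
    cache_creation_tokens: u64,
}

impl CacheStats {
    /// Accumulate per-response usage into the session-wide totals.
    fn record(&mut self, usage: &Usage) {
        self.api_calls += 1;
        // Assumption: a call counts as a hit when anything was read from cache.
        if usage.cache_read_tokens > 0 {
            self.cache_hits += 1;
        }
        self.input_tokens += usage.prompt_tokens;
        self.cache_read_tokens += usage.cache_read_tokens;
        self.cache_creation_tokens += usage.cache_creation_tokens;
    }

    /// Percentage of API calls that read anything from the cache.
    fn hit_rate(&self) -> f64 {
        if self.api_calls == 0 { return 0.0; }
        self.cache_hits as f64 / self.api_calls as f64 * 100.0
    }

    /// Percentage of input tokens served from the cache.
    fn efficiency(&self) -> f64 {
        if self.input_tokens == 0 { return 0.0; }
        self.cache_read_tokens as f64 / self.input_tokens as f64 * 100.0
    }
}
```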
Writes the current context window to logs/current_context_window (a symlink that points at the per-session file named by session ID).
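One way the file-plus-symlink arrangement might look; the per-session filename and the helper below are illustrative, not the actual implementation:

```rust
use std::fs;
use std::path::Path;

fn write_context_window_log(session_id: &str, contents: &str) -> std::io::Result<()> {
    let dir = Path::new("logs");
    fs::create_dir_all(dir)?;

    // Write the session-specific file, then point the stable name at it.
    let session_file = dir.join(format!("context_window_{session_id}"));
    fs::write(&session_file, contents)?;

    let link = dir.join("current_context_window");
    let _ = fs::remove_file(&link); // replace any stale symlink
    #[cfg(unix)]
    std::os::unix::fs::symlink(session_file.file_name().unwrap(), &link)?;
    Ok(())
}
```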
This PR was unfortunately generated by a different LLM, which did a ton of superficial reformatting. The actual change is fairly small and benign, but I don't want to roll back everything. Hope that's ok.