Commit Graph

2 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
999ac6fe66 fix: prevent parser poisoning from inline tool-call JSON patterns
The streaming parser was incorrectly detecting tool call patterns that
appeared inline in prose (e.g., when explaining the format), causing
g3 to return control mid-task.

Fix: Modified find_first_tool_call_start() and find_last_tool_call_start()
to only recognize patterns that appear on their own line (at start of
buffer or after newline with only whitespace before the pattern).

Changes:
- Added is_on_own_line() helper to check line-boundary conditions
- Updated detection methods to skip inline patterns
- Removed sanitize_inline_tool_patterns() and LBRACE_HOMOGLYPH (no longer needed)
- Rewrote tests for new behavior
- Added streaming_repro tests that use process_chunk() to verify the exact bug scenario

28 tests covering: streaming repro, line boundaries, Unicode, code contexts, edge cases
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
dc45987e8d Add characterization tests for UTF-8 truncation and parser sanitization
Agent: hopper

Adds 32 new integration tests covering recent commits:

## UTF-8 Safe Truncation Tests (14 tests)
Covers commit f30f145 (Fix UTF-8 panics):
- Topic extraction with emoji, CJK, and multi-byte characters
- Truncation at character boundaries (not byte boundaries)
- Edge cases: exactly 50 chars, 51 chars, 2-byte/3-byte/4-byte UTF-8
- Stub generation with multi-byte topics
- Combining characters and diacritics

## Parser Sanitization Tests (18 tests)
Covers commit 4c36cc0 (Prevent parser poisoning):
- Code block contexts (inline code, after fences, prose)
- Line boundary edge cases (empty lines, whitespace, indentation)
- Unicode handling (emoji, bullets, CJK before patterns)
- Multiple patterns on same line
- Negative cases (similar but different patterns, partial patterns)
- Real-world scenarios from the original bug report

All tests are blackbox/characterization style - they test observable
outputs through stable public interfaces without encoding internal
implementation details.
2026-01-13 11:22:46 +05:30