refactor: extract shared streaming utilities module

Agent: carmack

Create crates/g3-providers/src/streaming.rs with shared helpers:
- decode_utf8_streaming(): Handle incomplete UTF-8 sequences in SSE streams
- is_incomplete_json_error(): Detect incomplete vs malformed JSON
- make_final_chunk(): Create finished completion chunks
- make_text_chunk(): Create text content chunks
- make_tool_chunk(): Create tool call chunks

Refactor anthropic.rs:
- Use shared decode_utf8_streaming (removes 15 lines of inline UTF-8 handling)
- Use make_final_chunk, make_text_chunk, make_tool_chunk helpers
- Reduces verbose CompletionChunk constructions throughout

Refactor databricks.rs:
- Remove local copies of streaming helpers (now uses shared module)
- Reduces duplication between providers

Net reduction: 118 lines removed, 16 lines added (including new module)
All tests pass. Behavior unchanged.
This commit is contained in:
Dhanji R. Prasanna
2026-01-07 12:48:07 +11:00
parent bb63050779
commit 2bf475960c
5 changed files with 101 additions and 118 deletions

View File

@@ -0,0 +1,85 @@
//! Shared utilities for streaming SSE response parsing.
//!
//! This module provides common helpers used by multiple LLM providers
//! for handling Server-Sent Events (SSE) streaming responses.
use crate::{CompletionChunk, ToolCall, Usage};
// ─────────────────────────────────────────────────────────────────────────────
// UTF-8 Streaming
// ─────────────────────────────────────────────────────────────────────────────
/// Try to decode bytes as UTF-8, handling incomplete sequences at the end.
/// Returns the decoded string and leaves any incomplete bytes in the buffer.
pub fn decode_utf8_streaming(byte_buffer: &mut Vec<u8>) -> Option<String> {
match std::str::from_utf8(byte_buffer) {
Ok(s) => {
let result = s.to_string();
byte_buffer.clear();
Some(result)
}
Err(e) => {
let valid_up_to = e.valid_up_to();
if valid_up_to > 0 {
let valid_bytes: Vec<u8> = byte_buffer.drain(..valid_up_to).collect();
// Safe: we just validated these bytes
Some(String::from_utf8(valid_bytes).unwrap())
} else {
None // No valid UTF-8 yet, wait for more bytes
}
}
}
}
// ─────────────────────────────────────────────────────────────────────────────
// JSON Error Detection
// ─────────────────────────────────────────────────────────────────────────────
/// Check if a JSON parse error indicates incomplete data (vs. malformed JSON).
pub fn is_incomplete_json_error(error: &serde_json::Error, data: &str) -> bool {
let msg = error.to_string().to_lowercase();
let looks_incomplete = msg.contains("eof")
|| msg.contains("unterminated")
|| msg.contains("unexpected end")
|| msg.contains("trailing");
let missing_terminator = !data.trim_end().ends_with('}') && !data.trim_end().ends_with(']');
looks_incomplete || missing_terminator
}
// ─────────────────────────────────────────────────────────────────────────────
// Completion Chunk Helpers
// ─────────────────────────────────────────────────────────────────────────────
/// Create a final completion chunk with tool calls and usage.
pub fn make_final_chunk(tool_calls: Vec<ToolCall>, usage: Option<Usage>) -> CompletionChunk {
CompletionChunk {
content: String::new(),
finished: true,
usage,
tool_calls: if tool_calls.is_empty() {
None
} else {
Some(tool_calls)
},
}
}
/// Create a text content chunk (not finished).
pub fn make_text_chunk(content: String) -> CompletionChunk {
CompletionChunk {
content,
finished: false,
usage: None,
tool_calls: None,
}
}
/// Create a tool calls chunk (not finished).
pub fn make_tool_chunk(tool_calls: Vec<ToolCall>) -> CompletionChunk {
CompletionChunk {
content: String::new(),
finished: false,
usage: None,
tool_calls: Some(tool_calls),
}
}