The resolve_max_tokens() function was returning 2048 for embedded providers, which caused responses to be truncated prematurely. Increased to 8192 to allow the provider's own effective_max_tokens() calculation to work properly.
The resolve_max_tokens() function was returning 2048 for embedded providers, which caused responses to be truncated prematurely. Increased to 8192 to allow the provider's own effective_max_tokens() calculation to work properly.