e0de29fdd0
- Replace backup model stepfun-ai/step-3.5-flash with meta/llama-3.1-8b-instruct (stepfun is a thinking model that uses all tokens on reasoning and never outputs content, causing all 3 fallthroughs to fail) - Add retry with doubled max_tokens when primary model returns empty content (deepseek-v4-flash thinking can exhaust token budget) - Increase backup timeout to 120s and max_tokens to min 2048 - Move callApi error handling to return null instead of throw for cleaner fallthrough logic with timeout logging