Learning When to Reason: Gating LLM Inference for Cost-Efficient Serverless Function Scheduling at Scale