Sullivan M, He B, Evans P. Learning When to Reason: Gating LLM Inference for Cost-Efficient Serverless Function Scheduling at Scale. AJAS [Internet]. 2026 Jun 9 [cited 2026 Jun 12];2(1):39-45. Available from: https://asciences.org/index.php/ojs/article/view/65