Sullivan, Megan, Boyang He, and Patrick Evans. “Learning When to Reason: Gating LLM Inference for Cost-Efficient Serverless Function Scheduling at Scale.” Academic Journal of Applied Sciences 2, no. 1 (June 9, 2026): 39–45. Accessed June 12, 2026. https://asciences.org/index.php/ojs/article/view/65.