Sullivan, Megan, et al. “Learning When to Reason: Gating LLM Inference for Cost-Efficient Serverless Function Scheduling at Scale.” Academic Journal of Applied Sciences, vol. 2, no. 1, June 2026, pp. 39-45, https://doi.org/10.54097/gwmv0761.