Research

Andes: Defining and Enhancing Quality-of-Experience in LLM Serving

May 2023 – April 2024; Supervisor: Mosharaf Chowdhury @ Symbiotic Lab

  • Identified that in LLM text-streaming services, delivering text at least as fast as the user can read it is crucial to user experience, a gap left by prior metrics.
  • Defined Quality of Experience (QoE) for LLM serving as a metric that tracks each step of text generation and reflects the user's experience across the entire streaming process.
  • Formulated QoE optimization as a knapsack problem and developed a scheduling algorithm to maximize QoE by efficiently allocating resources.
  • Built Andes, an LLM serving system on top of vLLM, integrating the scheduling algorithm to enhance QoE in real-time LLM services.
  • Co-authored the paper “Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services” as second author.

Publications