Lightning Talk: Decoding and Taming the Costs of Serving Large Language Models - Yuan Chen, NVIDIA

Name: Lightning Talk: Decoding and Taming the Costs of Serving Large Language Models - Yuan Chen, NVIDIA
Uploaded: 2024-03-26T00:00:00.000Z
Duration: 314 s
Description: Video of the lightning talk "Decoding and Taming the Costs of Serving Large Language Models" by Yuan Chen, NVIDIA