A framework for efficient inference with text-generation models, aimed at good performance across a range of applications.
Discovered on HuggingFace (specific source unknown).