A fast, efficient library for distributed serving of large language models.
Discovered on Reddit: r/LocalLLaMA