An industrial-grade speech recognition toolkit that offers real-time processing, supports multiple languages, and includes features such as speaker diarization and emotion detection.
Discovered on GitHub via GitHub:modelscope