An industrial-grade speech recognition toolkit that supports real-time processing in more than 50 languages, with features such as speaker diarization and emotion detection.
Discovered on GitHub via modelscope.