- Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition, arXiv, 2405.15216
  Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly
- Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping, arXiv, 2404.07341
  Kevin Zhang, Luka Chkhetiani, Francis McCann Ramirez, Yash Khare, Andrea Vanzo, Michael Liang, Sergio Ramirez Martin, Gabriel Oexle, Ruben Bousbib, Taufiquzzaman Peyash
- Towards a World-English Language Model for On-Device Virtual Assistants, arXiv, 2403.18783
  Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil
- Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study, arXiv, 2401.12789
  W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath
- Zipformer: A faster and better encoder for automatic speech recognition, arXiv, 2310.11230
  Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey · (jiqizhixin)
- parakeet-tdt-1.1b - nvidia 🤗
- open_asr_leaderboard - hf-audio 🤗
- parakeet-ctc-0.6b - nvidia 🤗
- faster-whisper-server - fedirz
- whisper-diarization - MahmoudAshraf97
  Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
- asr-diarization - 🤗
- distil-large-v3 - distil-whisper 🤗
- OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer, arXiv, 2401.16658
  Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang
- whisperkit-coreml - argmaxinc 🤗 · (takeargmax)
- Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data, arXiv, 2309.13876
  Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan Sharma · (huggingface) · (huggingface)
- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling, arXiv, 2311.00430
  Sanchit Gandhi, Patrick von Platen, Alexander M. Rush · (distil-whisper - huggingface)
- whisperX - m-bain
  WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- insanely-fast-whisper - Vaibhavs10
- faster-whisper - guillaumekln
  Faster Whisper transcription with CTranslate2
- whisper_streaming - ufal
  Whisper realtime streaming for long speech-to-text transcription and translation
- Whisper-WebUI - jhj0517
  A Web UI for easy subtitle using whisper model.
- ratchet-whisper - FL33TW00D-HF 🤗
- NeMo - NVIDIA
  NeMo: a toolkit for conversational AI
- FunASR - alibaba-damo-academy
  A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
- automatic-speech-recognition - k2-fsa 🤗