Awesome ASR

Papers

  • Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition, arXiv, 2405.15216, arxiv, pdf, citation: -1

    Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

  • Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping, arXiv, 2404.07341, arxiv, pdf, citation: -1

    Kevin Zhang, Luka Chkhetiani, Francis McCann Ramirez, Yash Khare, Andrea Vanzo, Michael Liang, Sergio Ramirez Martin, Gabriel Oexle, Ruben Bousbib, Taufiquzzaman Peyash

  • Towards a World-English Language Model for On-Device Virtual Assistants, arXiv, 2403.18783, arxiv, pdf, citation: -1

    Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil

  • Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study, arXiv, 2401.12789, arxiv, pdf, citation: -1

    W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath

  • Zipformer: A faster and better encoder for automatic speech recognition, arXiv, 2310.11230, arxiv, pdf, citation: -1

    Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey · (jiqizhixin)

Models

Whisper

  • faster-whisper-server - fedirz

  • whisper-diarization - MahmoudAshraf97

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

  • asr-diarization - 🤗

  • distil-large-v3 - distil-whisper 🤗

  • OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer, arXiv, 2401.16658, arxiv, pdf, citation: -1

    Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang

  • whisperkit-coreml - argmaxinc 🤗

    · (takeargmax)

  • Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data, arXiv, 2309.13876, arxiv, pdf, citation: -1

    Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan Sharma · (huggingface) · (huggingface)

  • Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling, arXiv, 2311.00430, arxiv, pdf, citation: -1

    Sanchit Gandhi, Patrick von Platen, Alexander M. Rush · (distil-whisper - huggingface)

  • whisperX - m-bain

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  • insanely-fast-whisper - Vaibhavs10

  • faster-whisper - guillaumekln

    Faster Whisper transcription with CTranslate2

  • whisper_streaming - ufal

    Whisper realtime streaming for long speech-to-text transcription and translation

  • Whisper-WebUI - jhj0517

    A Web UI for easy subtitle generation using the Whisper model.

  • Speculative Decoding for 2x Faster Whisper Inference


Toolkits

  • NeMo - NVIDIA

    NeMo: a toolkit for conversational AI

  • FunASR - alibaba-damo-academy

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.

  • automatic-speech-recognition - k2-fsa 🤗

Products