Hugging Face Audio coursework
Updated Sep 7, 2023 - Jupyter Notebook
Created an ASR (Automatic Speech Recognition) system that takes in individual recordings. Each recording is a sentence of 5-10 spoken English digits separated by clear pauses. The system segments the sentence using a classifier that distinguishes foreground speech from background sound.
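The foreground/background segmentation step described above can be sketched as a simple frame-energy classifier. This is an illustrative approach, not the repository's actual code; the function name and the `threshold_ratio` parameter are assumptions for the example:

```python
import numpy as np

def segment_foreground(signal, frame_len=400, threshold_ratio=0.1):
    """Split a 1-D audio signal into foreground segments by frame energy.

    Frames whose RMS energy exceeds threshold_ratio * (max frame RMS)
    are labeled foreground; runs of consecutive foreground frames are
    merged into (start_sample, end_sample) segments.
    """
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    active = rms > threshold_ratio * rms.max()

    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i                     # a new foreground run begins
        elif not is_active and start is not None:
            segments.append((start * frame_len, i * frame_len))
            start = None                  # the run ended at this frame
    if start is not None:                 # signal ended mid-run
        segments.append((start * frame_len, n_frames * frame_len))
    return segments
```

Each returned segment would then be passed to the digit recognizer on its own; a real system would typically smooth the frame decisions (e.g. with a minimum-duration rule) rather than thresholding raw energy.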
Whisper Transcription Service
ASR course past paper revision work for the University of Edinburgh
A compilation of libraries, case studies, resources, and research papers revolving around deep learning/machine learning for audio
Speech Recording Tool
Guttural and scream automatic speech recognition (ASR) system using a fine-tuned version of OpenAI's Whisper model
Baidu TTS (Text-To-Speech) and ASR (Automatic Speech Recognition) demo for PC
Timestamped ASR microservice
CMUSphinx Website
[UAI 2024 paper] DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution.
Trained Transformer model for Speech Recognition
Comparator of WER, MER, and WIL scores for the Whisper, Vosk, and Google transcription services
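Of the three metrics in the entry above, WER (Word Error Rate) is the most common: the word-level edit distance between reference and hypothesis, divided by the reference length. A minimal sketch (not the comparator's actual code; libraries such as jiwer compute this, along with MER and WIL, in practice):

```python
def wer(reference, hypothesis):
    """Word Error Rate = (substitutions + deletions + insertions) / len(reference),
    computed via word-level Levenshtein distance with dynamic programming."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                       # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                       # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is one motivation for MER (Match Error Rate), a variant bounded to [0, 1].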
Different Task Guides for Audio Data
Bangla Automatic Speech Recognition
Supplementary files for the sequential routing framework
Text-To-Speech-Text (TTST) turns written text into spoken words, reading any input aloud to promote accessibility. English-to-target-language TTST helps localize computer applications and improve user understanding.
Material for my lecture on Automatic Speech Recognition
This project builds Automatic Speech Recognition (ASR, or voice recognition) using the pretrained Whisper and Wav2Vec2 models, from the Indonesia AI NLP Bootcamp.