Improving Speech Recognition with Jargon Injection

Some speech recognition systems require "training" (also called "enrollment") where an individual speaker reads text or isolated vocabulary into the system. The ...

Machine Learning and Speech Recognition Glossary

The model is learned from a set of audio recordings and their corresponding transcripts. It is created by taking audio recordings of speech, and ...

Noisy training for deep neural networks in speech recognition

The idea is simple: by injecting noise into the input speech data during DNN training, the noise patterns are expected to be ...
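
The noise-injection idea in this snippet can be illustrated with a minimal sketch. The snippet does not say which noise types or levels are used, so the white Gaussian noise, the target signal-to-noise ratio (SNR), and the function name `add_noise_at_snr` are all assumptions made here for illustration, not the paper's method.

```python
import math
import random


def add_noise_at_snr(samples, snr_db, seed=0):
    """Return a copy of `samples` with white Gaussian noise mixed in.

    Hypothetical helper: the noise level is chosen so that the ratio of
    signal power to expected noise power matches `snr_db` decibels.
    """
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Toy "speech": one second of a 440 Hz tone sampled at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_noise_at_snr(clean, snr_db=10)
```

In a training pipeline, a function like this would typically be applied to each utterance before feature extraction, possibly with a different SNR per utterance.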

How to Improve Recognition of Specific Words — NVIDIA Riva

Word boosting allows you to bias the ASR engine to recognize particular words of interest at request time, by giving them a higher score when ...
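
As a rough illustration of the scoring idea in this snippet, the sketch below adds a bonus to any hypothesis that contains a boosted word and then picks the best-scoring one. This is not the Riva API: the function `boost_rescore`, the n-best list format, and the scores are hypothetical, and a real engine may apply the bias inside the decoder rather than on a finished n-best list.

```python
def boost_rescore(hypotheses, boosted_words, boost=1.0):
    """Pick the best hypothesis after adding a per-word bonus.

    Hypothetical helper. `hypotheses` is a list of (text, score) pairs
    where a higher score is better, e.g. a log-probability.
    """
    boosted = {w.lower() for w in boosted_words}

    def boosted_score(item):
        text, score = item
        # Count tokens in the hypothesis that match a boosted word.
        hits = sum(1 for tok in text.lower().split() if tok in boosted)
        return score + boost * hits

    return max(hypotheses, key=boosted_score)[0]


# Made-up n-best list: the acoustically preferred hypothesis misses the jargon.
nbest = [
    ("the patient has a trail fibrillation", -4.1),
    ("the patient has atrial fibrillation", -4.6),
]
best = boost_rescore(nbest, ["atrial"], boost=1.0)
```

With the boost applied, the second hypothesis scores -3.6 against -4.1 and is selected; with an empty boost list the first hypothesis wins.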

Contextualized Speech Recognition: Rethinking Second-Pass ...

Improving end-to-end contextual speech recognition with fine-grained ... Prompting large language models for zero-shot domain adaptation in speech recognition.

Using Text Injection to Improve Recognition of Personal Identifiers in ...

converting the text data to speech via TTS [13, 14, 15, 16, 17]. ... resentation to perform text injection. ... 58M in the text encoder which is ...

Corpus Synthesis for Zero-shot ASR Domain Adaptation using ...

While Automatic Speech Recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to new domains and need to ...

Improving N-Best Rescoring in Under-Resourced Code-Switched ...

We find in our experiments that, by combining n-gram augmentation with the optimised pretraining strategy, speech recognition errors are reduced ...

Improving speech recognition using bionic wavelet features

The speech feature vector is calculated using the parameters extracted by Bionic wavelet with different central frequencies of Morlet, Daubechies and Bior3.5, ...

Improving N-Best Rescoring in Under-Resourced Code-Switched ...

We find that, even though these language models have not been trained on any of our target languages, they can improve speech recognition performance even in ...

Few-Shot Spoken Language Understanding Via Joint Speech-Text ...

Abstract: Recent work on speech representation models jointly pre-trained with text has demonstrated the potential of improving speech representations by ...

Speech, voice activation, inking, typing, and privacy

You no longer need to turn on the Online Speech recognition setting to use voice typing. You can also choose to contribute voice clips to help improve voice ...

Deep learning: from speech recognition to language and multimodal ...

The results show that the information provided by text significantly improves zero-shot ... : Improving speech recognition in reverberation using a room-aware ...

Improving ASR Models Using LLM-Powered Workflow - Labellerr

Automatic speech recognition (ASR) technology is used in a variety of professional contexts to convert spoken language into written text, ...

Select a transcription model | Cloud Speech-to-Text Documentation

Refer to the speech:recognize API endpoint for complete details. To perform synchronous speech recognition, make a POST request and provide the appropriate ...

Towards Zero-Shot Learning for Automatic Phonemic Transcription

In particular, phoneme transcription tools are useful for low-resource language documentation by improving ... way to tackle zero-shot learning of speech ...

Improving speech recognition systems for the morphologically ...

This article presents the research work on improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for ...

Transfer Learning Using L2 Speech to Improve Automatic Speech ...

Our results suggest that including L2 speech in the training data improves dysarthric speech recognition in speaker-dependent and speaker-independent settings.

Speech recognition technology shows significant gains for people ...

This study appears in the Journal of Speech, Language, and Hearing Research. The speech recordings used in the study are freely available to ...

Foreign accent conversion to arbitrary non-native speakers using ...

... improving speech recognition performance (Biadsy et al., 2019). A variety of ... Discussion. We have proposed Accentron, a zero-shot many-to-many speech ...