Visual Information Matters for ASR Error Correction.


[2303.10160] Visual Information Matters for ASR Error Correction

This paper provides 1) simple yet effective methods, namely gated fusion and image captions as prompts to incorporate visual information to help EC;

Visual Information Matters for ASR Error Correction - IEEE Xplore

VISUAL INFORMATION MATTERS FOR ASR ERROR CORRECTION. Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang. ByteDance. {vanya.bk, chengshanbo ...

Visual Information Matters for ASR Error Correction - IEEE SigPort

Chicago ... SigPort hosts manuscripts, reports, theses, and supporting materials of interest to the broad signal processing community and ...

Visual Information Matters for ASR Error Correction | Request PDF

Error correction has been proven an effective means of refining mistakes produced by the Automatic Speech Recognition (ASR) models, thereby contributing to a ...

Vanya Bannihatti Kumar - Papers With Code

Visual Information Matters for ASR Error Correction ... Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error ...

Large Language Model Based Generative Error Correction

We will provide errorful speaker-attributed transcripts produced by a multi-speaker ASR system. Participants in Task 2 are asked to submit ...

Remember the context! ASR slot error correction through ...

Information and knowledge management · Machine learning · Operations research ... vision identification technologies. The intern/co-op project(s) and the ...

Towards Understanding ASR Error Correction for Medical ...

... One common approach to ASR error detection is through machine translation. Machine translation approaches in error detection commonly interpret error ...

It's Never Too Late: Fusing Acoustic Information into Large...

This paper builds upon the recently proposed “generative error correction” (GEC) paradigm for speech recognition (ASR). The previous GEC work ...

It's Never Too Late: Fusing Acoustic Information into Large ...

for generative error correction (GER) on top of the automatic speech recognition (ASR) output ... visual speech recognition. Authors. Chen ...

Error Correction with Soft Detection for Automatic Speech Recognition

The training, development, and test data for correction models are obtained by using the ASR model to transcribe the corresponding datasets in AISHELL-1 and ...

Significant ASR error detection for conversational voice assistants

Leveraging Large Language Models for Enhanced ASR Error ...

Error correction is a vital element in modern automatic speech recognition (ASR) systems. A significant portion of ASR error correction work is closely ...

Pronunciation Guided Copy And Correction Model For ASR Error ...

information loss. On the other hand, some studies have focused on integrating phonetic knowledge into the error correction model. [9] ...

Effects of WER on ASR Correction Interfaces for Mobile Text Entry

recognition (ASR) system decodes the acoustic data and provides a ... perform when correcting speech recognition errors on a visual interface. One ...

On Leveraging Visual Modality for Speech Recognition Error ...

errors made by ASR systems are function word errors, which cannot be efficiently corrected with additional high-level contexts such as visual information.

A Multimodal Speech Recognition System With Vision Hotwords

Visual information matters for ASR error correction. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ...

Whispering LLaMA: A Cross-Modal Generative Error Correction ...

its task-specific data (e.g., speech recognition in our instance). Our ... of the TAP-based generative ASR error correction. 4.4 Performance Studies.

Non-Autoregressive Chinese ASR Error Correction with ...

SpellGCN: Incorporating phonological and visual ... SpecAugment: A simple data augmentation method for automatic speech recognition.

Vanya Bannihatti Kumar - Google Scholar

Visual Information Matters for ASR Error Correction. VB Kumar, S Cheng, N Peng, Y Zhang. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and ...