Events2Join

[2303.10160] Visual Information Matters for ASR Error Correction


This paper provides 1) simple yet effective methods, namely gated fusion and image captions as prompts, to incorporate visual information to help EC;

(PDF) Visual Information Matters for ASR Error Correction

PDF | Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques ...

Visual Information Matters for ASR Error Correction - GoTriple

ftarxivpreprints:oai:arXiv.org:2303.10160. Visual Information Matters for ASR Error Correction. Authors: Kumar, Vanya Bannihatti ...

Visual Information Matters for ASR Error Correction. - dblp

Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang: Visual Information Matters for ASR Error Correction. CoRR abs/2303.10160 (2023).

Visual Information Matters for ASR Error Correction - IEEE SigPort

Visual Information Matters for ASR Error Correction. Citation Author(s): Submitted by: Vanya Bannihatt... Last updated: 29 May 2023 - 11: ...

Visual Information Matters for ASR Error Correction | Request PDF

Request PDF | On Jun 4, 2023, Vanya Bannihatti Kumar and others published Visual Information Matters for ASR Error Correction | Find, read and cite all the ...

Ningxin Peng - Google Scholar

Visual Information Matters for ASR Error Correction. V Bannihatti Kumar, S Cheng, N Peng, Y Zhang. arXiv e-prints, arXiv: 2303.10160, 2023. 2023 ...

Computer Science Mar 2023 - arXiv

[8181] arXiv:2303.10160 (cross-list from eess.AS) [pdf, other]. Title: Visual Information Matters for ASR Error Correction. Vanya Bannihatti Kumar, Shanbo ...

Visual Information Matters for ASR Error Correction - Paper Reading

Aiming to improve the Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely ...

It's Never Too Late: Fusing Acoustic Information into Large ...

... error correction (GER) on top of the automatic speech recognition (ASR) output. Specifically, an LLM is utilized to carry out a direct mapping from the N ...

Investigating ASR Error Correction with Large Language Model and ...

This paper investigates using pre-trained large language models (LLMs) to improve multilingual automatic speech recognition (ASR) outputs.

ASR Error Correction with Augmented Transformer for Entity Retrieval

Without loss of generality, we use phonetic information as an example, which ties to our application of ASR error correction. The model architecture is ...

ASR Error Correction with Constrained Decoding on Operation ...

Wp are the trainable parameters. Then, a standard transformer decoder [20] is employed to fuse all relevant information: hk t+ ...