Phoneme class based feature adaptation for mismatch acoustic modeling and recognition of distant noisy speech

Uluskan, Seçkin; Sangwan, Abhijeet; Hansen, John H. L.

Access

info:eu-repo/semantics/closedAccess

Date

2017

Abstract

Distant speech capture in lecture halls and auditoriums poses unique challenges for algorithm development in automatic speech recognition (ASR). In this study, a new adaptation strategy for distant noisy speech is built on phoneme classes. Unlike previous approaches, which adapt the acoustic model to the features, the proposed phoneme-class based feature adaptation (PCBFA) strategy adapts the features of the distant data to an existing acoustic model that was previously trained on close-talk microphone speech. The essence of PCBFA is a transformation that makes the phoneme-class distributions of distant noisy speech similar to those of the close-talk microphone acoustic model in a multidimensional mel-frequency cepstral coefficient (MFCC) space. To this end, the phoneme classes of distant noisy speech are recognized with artificial neural networks. The main idea behind PCBFA is illustrated with a conventional Gaussian mixture model-hidden Markov model (GMM-HMM) system, although it can be extended to newer ASR structures. The adapted features, together with the improved acoustic models produced by PCBFA, are shown to outperform those obtained through acoustic model adaptation alone, for both ASR and keyword spotting. PCBFA offers a powerful new perspective on acoustic modeling of distant speech.
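
The feature-side adaptation described in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the transformation, so the per-class mean and variance matching used here is an assumption chosen for simplicity, not the authors' method. The function names `class_stats` and `adapt_features`, the class labels, and the toy two-dimensional frames are all hypothetical; in the study the phoneme-class labels come from a neural network, whereas here they are simply given.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def class_stats(frames, labels):
    """Per-class, per-dimension mean and standard deviation of feature frames."""
    grouped = defaultdict(list)
    for frame, label in zip(frames, labels):
        grouped[label].append(frame)
    stats = {}
    for label, rows in grouped.items():
        dims = list(zip(*rows))  # transpose: one tuple per feature dimension
        stats[label] = ([fmean(d) for d in dims], [pstdev(d) for d in dims])
    return stats


def adapt_features(frames, labels, distant_stats, close_stats, eps=1e-8):
    """Shift and scale each distant frame toward the close-talk statistics of its class.

    Assumption: a diagonal affine map (mean/variance matching) per phoneme class.
    """
    adapted = []
    for frame, label in zip(frames, labels):
        d_mean, d_std = distant_stats[label]
        c_mean, c_std = close_stats[label]
        adapted.append([
            (x - dm) / max(ds, eps) * cs + cm
            for x, dm, ds, cm, cs in zip(frame, d_mean, d_std, c_mean, c_std)
        ])
    return adapted


# Toy data: two hypothetical phoneme classes, two feature dimensions.
close_frames = [[0.0, 1.0], [2.0, 3.0], [10.0, 10.0], [12.0, 14.0]]
close_labels = ["vowel", "vowel", "fricative", "fricative"]
distant_frames = [[5.0, 5.0], [6.0, 9.0], [1.0, 1.0], [2.0, 2.0]]
distant_labels = ["vowel", "vowel", "fricative", "fricative"]

close = class_stats(close_frames, close_labels)
distant = class_stats(distant_frames, distant_labels)
adapted = adapt_features(distant_frames, distant_labels, distant, close)
adapted_stats = class_stats(adapted, distant_labels)
```

After the transform, the per-class means and standard deviations in `adapted_stats` match those in `close`, which is the sense in which the distant features are moved toward the close-talk model while the model itself stays unchanged.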

Source

International Journal of Speech Technology

Volume

Issue

Links

https://dx.doi.org/10.1007/s10772-017-9449-6
https://hdl.handle.net/11421/10415

Collections

Article Collection [791]
Scopus Indexed Publications Collection [8325]
WoS Indexed Publications Collection [7605]