A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
Article
Publication date
2015
Author
Poblete Ramírez, Víctor
Abstract
This paper proposes a new set of speech features called Locally-Normalized Cepstral Coefficients (LNCC) that are based on Seneff’s Generalized Synchrony Detector (GSD). First, an analysis of the GSD frequency response is provided to show that it generates spurious peaks at harmonics of the detected frequency. Then, the GSD frequency response is modeled as a quotient of two filters centered at the detected frequency. The numerator is a triangular band-pass filter centered around a particular frequency, similar to the ordinary Mel filters. The denominator term is a filter that responds maximally to frequency components on either side of the numerator filter. As a result, a local normalization is performed without the spurious peaks of the original GSD. Speaker verification results demonstrate that the proposed LNCC features are of low computational complexity and far more effectively compensate for spectral tilt than ordinary MFCC coefficients. LNCC features do not require the computation and storage of a moving average of the feature values, and they provide relative reductions in Equal Error Rate (EER) as high as 47.7%, 34.0% or 25.8% when compared with MFCC, MFCC + CMN, or MFCC + RASTA in one case of variable spectral tilt, respectively.
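The quotient-of-two-filters idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the denominator shape used here (zero at the center frequency, rising linearly to the band edges), the function names, and the toy spectrum are all assumptions made for illustration. The sketch only shows the basic property that a gain which is constant across the band cancels in the ratio, whereas it shifts an ordinary log filter-bank energy by the log of the gain.

```python
import math

def triangular_weight(f, center, half_width):
    """Numerator filter: triangular band-pass weight peaking at `center`."""
    return max(0.0, 1.0 - abs(f - center) / half_width)

def side_weight(f, center, half_width):
    """Hypothetical denominator filter (an assumption, not taken from the paper):
    zero at `center`, rising linearly to 1 at the band edges, zero outside."""
    d = abs(f - center) / half_width
    return d if d <= 1.0 else 0.0

def log_band_energy(spectrum, freqs, center, half_width):
    """Ordinary log filter-bank energy, as used before the DCT in MFCC."""
    return math.log(sum(triangular_weight(f, center, half_width) * s
                        for f, s in zip(freqs, spectrum)))

def locally_normalized_log_energy(spectrum, freqs, center, half_width):
    """Log of numerator energy over denominator energy.
    Assumes a strictly positive spectrum so that both sums are non-zero."""
    num = sum(triangular_weight(f, center, half_width) * s
              for f, s in zip(freqs, spectrum))
    den = sum(side_weight(f, center, half_width) * s
              for f, s in zip(freqs, spectrum))
    return math.log(num / den)

# Toy power spectrum on a uniform frequency grid (illustrative values only).
freqs = [i * 15.625 for i in range(257)]            # 0 Hz to 4000 Hz
spectrum = [1.0 + 0.5 * math.sin(0.01 * f) for f in freqs]

# Simulate a channel that applies the same gain to the whole band.
gain = 4.0
scaled = [gain * s for s in spectrum]

center, half_width = 1000.0, 300.0
plain_shift = (log_band_energy(scaled, freqs, center, half_width)
               - log_band_energy(spectrum, freqs, center, half_width))
local_shift = (locally_normalized_log_energy(scaled, freqs, center, half_width)
               - locally_normalized_log_energy(spectrum, freqs, center, half_width))

print("shift in plain log energy:", plain_shift)
print("shift in locally normalized log energy:", local_shift)
```

Because the normalization uses only the current frame, no running average of feature values is needed, which matches the "instantaneous" property claimed in the abstract. A channel response that varies slowly across the band would cancel only approximately, and this sketch does not attempt to quantify that case.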
General note
ISI publication article
Sponsor
CONICYT-ANILLO
ACT 1120
CONICYT-FONDECYT
1100195
EPSRC
EP/I031022/1
Defense Advanced Research Projects Agency (DARPA)
D10PC20024
Identifier
URI: https://repositorio.uchile.cl/handle/2250/132800
DOI: 10.1016/j.csl.2014.10.006
Quote Item
Computer Speech and Language 31 (2015) 1–27