A novel method for estimating the number of speakers based on generalized eigenvalue-vector decomposition and adaptive wavelet transform by using K-means clustering

Dehghan Firoozabadi, Ali; Irarrázaval, Pablo; Adasme, Pablo; Zabala Blanco, David; Azurdia Meza, César

Author	dc.contributor.author	Dehghan Firoozabadi, Ali
Author	dc.contributor.author	Irarrázaval, Pablo
Author	dc.contributor.author	Adasme, Pablo
Author	dc.contributor.author	Zabala Blanco, David
Author	dc.contributor.author	Azurdia Meza, César
Admission date	dc.date.accessioned	2020-05-04T20:25:49Z
Available date	dc.date.available	2020-05-04T20:25:49Z
Publication date	dc.date.issued	2020
Item citation	dc.identifier.citation	Signal, Image and Video Processing Jan 2020	es_ES
Identifier	dc.identifier.other	10.1007/s11760-020-01634-2
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/174285
Abstract	dc.description.abstract	The aim of this article is to estimate the number of simultaneous speakers from overlapped speech signals. The percentage of correctly estimated speaker counts is an important factor for the proposed algorithm. The proposed method is based on spectral estimation using the adaptive wavelet transform in combination with generalized eigenvalue-vector decomposition (GEVD) and K-means clustering. First, the speech signals are recorded by a uniform circular array, and each pair of adjacent microphones is considered for processing. Then, the spectral estimation method is applied to all microphone signals to select the best part of the speech spectrum. Next, the microphone signals are divided into subbands using the adaptive wavelet transform. The GEVD algorithm is applied to each microphone pair in the different subbands and time frames to estimate the room impulse response and the time difference of arrival (TDOA). Finally, K-means clustering with the silhouette criterion is used to estimate the number of speakers (the K value). The proposed algorithm is evaluated on simulated and real data to show the superiority of the proposed method over the PENS, Bessel, i-vector PLDA, Hilbert envelope, and DNN-based methods. The proposed scheme outperforms the other evaluated schemes by 18% in terms of correct estimations in noisy-reverberant conditions for five simultaneous speakers.	es_ES
Sponsor	dc.description.sponsorship	Comision Nacional de Investigacion Cientifica y Tecnologica (CONICYT) CONICYT FONDECYT 3190147 11180107 11160517	es_ES
Language	dc.language.iso	en	es_ES
Publisher	dc.publisher	Springer	es_ES
Type of license	dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile	*
Source	dc.source	Signal, Image and Video Processing	es_ES
Keywords	dc.subject	Adaptive filters	es_ES
Keywords	dc.subject	Eigenvalue-vector decomposition	es_ES
Keywords	dc.subject	Speaker counting	es_ES
Keywords	dc.subject	Spectral estimation	es_ES
Keywords	dc.subject	Wavelet transform	es_ES
Title	dc.title	A novel method for estimating the number of speakers based on generalized eigenvalue-vector decomposition and adaptive wavelet transform by using K-means clustering	es_ES
Document type	dc.type	Journal article	es_ES
dcterms.accessRights	dcterms.accessRights	Metadata-only access	es_ES
Cataloguer	uchile.catalogador	ivv	es_ES
Indexation	uchile.index	ISI publication article
Indexation	uchile.index	SCOPUS publication article
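
The last step named in the abstract, K-means clustering with a silhouette criterion to choose K, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the feature set, distance measure, or range of K, so the sketch assumes scalar TDOA estimates (in milliseconds), absolute-difference distances, and a hypothetical search range of 2 to `k_max`. The function names and the synthetic data are illustrative only. The silhouette score is undefined for a single cluster, so this sketch cannot return K = 1.

```python
import random


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars; centroids start at evenly spaced quantiles."""
    ordered = sorted(values)
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Recompute each centroid as the mean of its members (keep it if empty).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def mean_silhouette(values, labels):
    """Average silhouette width; singleton clusters contribute 0 by convention."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(v - u) for u in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(values)


def estimate_speaker_count(values, k_max=6):
    """Return the K in 2..k_max whose clustering has the highest mean silhouette."""
    scores = {
        k: mean_silhouette(values, kmeans_1d(values, k))
        for k in range(2, k_max + 1)
    }
    return max(scores, key=scores.get)


# Synthetic TDOAs (ms) for three hypothetical speakers, 40 estimates each.
rng = random.Random(0)
tdoas = [rng.gauss(center, 0.02) for center in (-0.4, 0.0, 0.5) for _ in range(40)]
estimated = estimate_speaker_count(tdoas)
print(estimated)
```

In the method described in the abstract, the clustered quantities would come from the GEVD stage across microphone pairs, subbands, and time frames, which would likely make the feature vectors multidimensional; the scalar case is used here only to keep the sketch short.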

Files in this item

Name: A-novel-method-for-estimating- ...
Size: 47.06Kb
Format: PDF

This item appears in the following Collection(s)

Journal articles
