
Author (dc.contributor.author): Dehghan Firoozabadi, Ali
Author (dc.contributor.author): Irarrázaval, Pablo
Author (dc.contributor.author): Adasme, Pablo
Author (dc.contributor.author): Zabala Blanco, David
Author (dc.contributor.author): Azurdia Meza, César
Accession date (dc.date.accessioned): 2020-05-04T20:25:49Z
Available date (dc.date.available): 2020-05-04T20:25:49Z
Publication date (dc.date.issued): 2020
Item citation (dc.identifier.citation): Signal, Image and Video Processing, Jan 2020
Identifier (dc.identifier.other): 10.1007/s11760-020-01634-2
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/174285
Abstract (dc.description.abstract): The aim of this article is to estimate the number of simultaneous speakers from overlapped speech signals. The percentage of correctly estimated speaker counts is an important performance measure for the proposed algorithm. The proposed method is based on spectral estimation using the adaptive wavelet transform in combination with generalized eigenvalue-vector decomposition (GEVD) and K-means clustering. First, the speech signals are recorded by a uniform circular array, and each pair of adjacent microphones is considered for processing. Then, a spectral estimation method is applied to all microphone signals to select the best part of the speech spectrum. Next, the microphone signals are divided into subbands using the adaptive wavelet transform. The GEVD algorithm is applied to each microphone pair in each subband and time frame to estimate the room impulse response and the time difference of arrival (TDOA). Finally, K-means clustering with the silhouette criterion is used to estimate the number of speakers (the value of K). The proposed algorithm is evaluated on simulated and real data to show its superiority over the PENS, Bessel, i-vector PLDA, Hilbert envelope and DNN-based methods. The proposed scheme outperforms the other evaluated schemes by 18% in terms of correct estimations in noisy-reverberant conditions for five simultaneous speakers.
Sponsor (dc.description.sponsorship): Comision Nacional de Investigacion Cientifica y Tecnologica (CONICYT), CONICYT FONDECYT 3190147, 11180107, 11160517
Language (dc.language.iso): en
Publisher (dc.publisher): Springer
Type of license (dc.rights): Attribution-NonCommercial-NoDerivs 3.0 Chile
Source (dc.source): Signal, Image and Video Processing
Keywords (dc.subject): Adaptive filters
Keywords (dc.subject): Eigenvalue-vector decomposition
Keywords (dc.subject): Speaker counting
Keywords (dc.subject): Spectral estimation
Keywords (dc.subject): Wavelet transform
Title (dc.title): A novel method for estimating the number of speakers based on generalized eigenvalue-vector decomposition and adaptive wavelet transform by using K-means clustering
Document type (dc.type): Journal article
Access rights (dcterms.accessRights): Metadata-only access
Cataloguer (uchile.catalogador): ivv
Indexation (uchile.index): Article in ISI-indexed publication
Indexation (uchile.index): Article in SCOPUS-indexed publication
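
Illustrative note on the final step of the abstract (choosing K for K-means with the silhouette criterion). The following is a minimal sketch only, not the authors' implementation. It assumes each TDOA estimate is a single scalar in milliseconds, uses synthetic values in place of GEVD output, and all function names (kmeans_1d, mean_silhouette, estimate_num_speakers) are hypothetical.

```python
import random

def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars; centroids start at evenly spaced quantiles."""
    s = sorted(values)
    centroids = [s[int((i + 0.5) * len(s) / k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def mean_silhouette(values, labels):
    """Mean silhouette coefficient; points in singleton clusters contribute 0."""
    clusters = {}
    for v, l in zip(values, labels):
        clusters.setdefault(l, []).append(v)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for v, l in zip(values, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)   # cohesion
        b = min(sum(abs(v - u) for u in other) / len(other)  # separation
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(values)

def estimate_num_speakers(tdoas, k_max=6):
    """Return the K in [2, k_max] with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(tdoas, kmeans_1d(tdoas, k))
              for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Synthetic stand-in for per-frame, per-subband TDOA estimates (assumed values):
# three speakers at -0.4, 0.0 and 0.3 ms with small Gaussian jitter.
random.seed(0)
tdoas = [random.gauss(mu, 0.02) for mu in (-0.4, 0.0, 0.3) for _ in range(40)]
k_hat, scores = estimate_num_speakers(tdoas)
print("estimated number of speakers:", k_hat)
```

The silhouette criterion as sketched here cannot return K = 1, since the score is undefined for a single cluster; how the published method treats the single-speaker case is not stated in the abstract.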

