3D multiple sound source localization by proposed cuboids nested microphone array in combination with adaptive Wavelet-based subband GEVD

Dehghan Firoozabadi, Ali; Irarrázaval, Pablo; Adasme, Pablo; Zabala Blanco, David; Palacios Játiva, Pablo Geovanny; Azurdia Meza, César

Author	dc.contributor.author	Dehghan Firoozabadi, Ali
Author	dc.contributor.author	Irarrázaval, Pablo
Author	dc.contributor.author	Adasme, Pablo
Author	dc.contributor.author	Zabala Blanco, David
Author	dc.contributor.author	Palacios Játiva, Pablo Geovanny
Author	dc.contributor.author	Azurdia Meza, César
Admission date	dc.date.accessioned	2020-09-21T14:52:10Z
Available date	dc.date.available	2020-09-21T14:52:10Z
Publication date	dc.date.issued	2020
Item citation	dc.identifier.citation	Electronics 2020, 9, 867	es_ES
Identifier	dc.identifier.other	10.3390/electronics9050867
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/176785
Abstract	dc.description.abstract	Sound source localization is one of the application areas of speech signal processing. The main challenge arises when multiple sound sources must be localized simultaneously from overlapping speech signals with an unknown number of speakers. A method is therefore required that estimates both the number of speakers and their locations with high accuracy under real-time conditions. Spatial aliasing is an undesirable effect of using microphone arrays, and it decreases the accuracy of localization algorithms in noisy and reverberant conditions. In this article, a cuboids nested microphone array (CuNMA) is first proposed to eliminate spatial aliasing. The CuNMA is designed to receive the speech signals of all speakers from different directions. In addition, the inter-microphone distance is adjusted so that each subarray contains enough microphone pairs, which provides appropriate information for 3D sound source localization. Subsequently, a speech spectral estimation method is used to evaluate the speech spectrum components. Suitable spectrum components are selected and undesirable components are discarded in the localization process. Because speech information differs across frequency bands, an adaptive wavelet transform is used for subband processing in the proposed algorithm. The generalized eigenvalue decomposition (GEVD) method is applied in each subband to all nested microphone pairs, and a probability density function (PDF) is calculated for estimating the direction of arrival (DOA) in different subbands and consecutive frames. Suitable PDFs are selected by thresholding the standard deviation (SD) of the estimated DOAs, and the rest are eliminated. This process is repeated over time frames to extract the best DOAs. 
Finally, K-means clustering and the silhouette criterion are used to classify the DOAs and to estimate the number of clusters (speakers) and the related DOAs. All DOAs in each cluster are intersected to estimate the 3D position of each speaker. The point closest to all DOA planes is selected as the speaker position. The proposed method is compared with the hierarchical grid (HiGRID), perpendicular cross-spectra fusion (PCSF), time-frequency wise spatial spectrum clustering (TF-wise SSC), and spectral source model-deep neural network (SSM-DNN) algorithms in terms of accuracy and computational complexity on real and simulated data in noisy and reverberant conditions. The results show that the proposed method outperforms these previous works.	es_ES
Sponsor	dc.description.sponsorship	Comisión Nacional de Investigación Científica y Tecnológica (CONICYT) CONICYT FONDECYT 3190147 11180107 ANID PFCHA/Beca de Doctorado Nacional/2019 21190489	es_ES
Language	dc.language.iso	en	es_ES
Publisher	dc.publisher	MDPI	es_ES
Type of license	dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile	*
Link to License	dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/	*
Source	dc.source	Electronics	es_ES
Keywords	dc.subject	Sound source localization	es_ES
Keywords	dc.subject	Nested microphone array	es_ES
Keywords	dc.subject	Spectral estimation	es_ES
Keywords	dc.subject	Wavelet transform	es_ES
Keywords	dc.subject	Subband processing	es_ES
Keywords	dc.subject	Clustering	es_ES
Title	dc.title	3D multiple sound source localization by proposed cuboids nested microphone array in combination with adaptive Wavelet-based subband GEVD	es_ES
Document type	dc.type	Journal article	es_ES
Access rights	dcterms.accessRights	Open access
Cataloguer	uchile.catalogador	ctc	es_ES
Indexing	uchile.index	ISI publication article
Indexing	uchile.index	SCOPUS publication article
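
The abstract's final localization step, selecting the point closest to all DOA planes in a cluster, can be illustrated with a small least-squares sketch. This is not the authors' implementation: it assumes each DOA has already been converted to a plane of the form n . x = d, a conversion this record does not describe. The function name `closest_point_to_planes` and the example planes are hypothetical.

```python
import numpy as np


def closest_point_to_planes(normals, offsets):
    """Return the point minimizing the sum of squared distances to planes n_i . x = d_i.

    Assumption: every DOA in a cluster is given as a plane (normal n_i, offset d_i).
    """
    n = np.asarray(normals, dtype=float)
    d = np.asarray(offsets, dtype=float)

    # Scale each plane to a unit normal so that the residual n_i . x - d_i
    # equals the signed Euclidean distance from x to that plane.
    lengths = np.linalg.norm(n, axis=1)
    n_unit = n / lengths[:, None]
    d_unit = d / lengths

    # Ordinary least squares over all planes in the cluster.
    point, *_ = np.linalg.lstsq(n_unit, d_unit, rcond=None)
    return point


# Hypothetical cluster: four planes that all pass through (1, 2, 3).
normals = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
offsets = [3, 5, 4, 6]
estimate = closest_point_to_planes(normals, offsets)
```

With noisy DOAs the planes no longer meet at a single point, and the same call returns the least-squares compromise between them.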

Files in this item

Name: 3D-Multiple-Sound-Source.pdf
Size: 8.416Mb
Format: PDF

This item appears in the following Collection(s)

Journal articles

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Chile