Perceptual modeling: factors influencing speech intelligibility in a multitalker environment and applications in speech separation

Kainz, Andrea

doi:10.34726/hss.2017.41204

Record link:

https://doi.org/10.34726/hss.2017.41204
http://hdl.handle.net/20.500.12708/3806

Title:

Perceptual modeling: factors influencing speech intelligibility in a multitalker environment and applications in speech separation

Citation:

Kainz, A. (2017). Perceptual modeling: factors influencing speech intelligibility in a multitalker environment and applications in speech separation [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2017.41204

reposiTUm DOI:

10.34726/hss.2017.41204

CatalogPlus:

AC13727229

Publication Type:

Thesis - Diploma Thesis

Language:

English

Authors:

Kainz, Andrea

Advisor:

Rattay, Frank

Organisational Unit:

E101 - Institut für Analysis und Scientific Computing

Date (published):

2017

Keywords:

speech intelligibility; Computational Auditory Scene Analysis; Oldenburger Logatome Corpus; Speech Shaped Noise

Abstract:

This thesis investigates speech intelligibility in multitalker environments, where the listener must focus on one speaker in the presence of simultaneous interfering talkers or background noise in order to follow the conversation. This is generally not a difficult task for normal-hearing listeners, but it can be a challenge for people with hearing impairment, and dealing with interfering speech signals remains a problem for machines. The thesis presents different speech segregation algorithms together with their mathematical and statistical background, covering two approaches to processing interfering speech signals. Motivated by the powerful ability of the auditory system to analyze and segregate incoming sounds, Computational Auditory Scene Analysis (CASA) aims to replicate the different auditory processing stages. Blind Source Separation (BSS), another essential approach that differs from CASA, uses results from statistics and information theory to separate a signal mixture into its sources. In the experimental part of the thesis, a speech intelligibility (SI) test implemented in MATLAB® (R2015b) was performed. Its aim was to investigate factors affecting speech intelligibility, with the main focus on the attributes of the masker signals and their influence on the perception of the target signal. Twelve normal-hearing listeners participated in the test; their task was to identify the target signals in the presence of different masker signals. The target signals consisted of 14 nonsense syllables (e.g. 'affa' or 'assa') from the Oldenburger Logatome Corpus (OLLO), spoken by four female speakers. The masker signals included sentences from the Oldenburger Satztest (e.g. 'Britta verleiht elf alte Bilder'), the International Speech Test Signal (ISTS) and Speech Shaped Noise (SSN).
The test was evaluated with a two-way repeated measures analysis of variance (ANOVA) in SPSS® Statistics (version 24), using the two within-subject factors "Signal-to-Noise Ratio" (SNR) and "Masker Type". The results showed a significant main effect of both factors (p < 0.001). In further analyses, ANOVA also showed a significant influence of the factors "Number of Maskers" (p < 0.001) and "Spectral Diversity of the Masker" (p < 0.001) on speech intelligibility.

Additional information:

Summary in German

License:

In Copyright

Appears in Collections:

Thesis