<div class="csl-bib-body">
<div class="csl-entry">Toman, M. (2016). <i>Transformation and interpolation of language varieties for speech synthesis</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2016.25509</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2016.25509
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/6058
-
dc.description
Abstract in German
-
dc.description
Title differs according to the author's translation
-
dc.description.abstract
This thesis aims to advance the field of speech synthesis by investigating and developing new concepts for acoustic modeling, transformation and interpolation of language varieties (i.e. dialects, sociolects, foreign accents). The goal is to enable systems with speech output to adapt to the individual needs and preferences of their users. Transformation of language varieties aims to convert a voice model from one variety to a model in another variety while retaining the voice characteristics. Interpolation between multiple voice models of different varieties allows intermediate varieties to be generated. Both approaches are used to widen the range of speaking styles available to speech output systems. Further, two specific applications are investigated in this thesis: foreign accent reduction and the generation of intelligible fast speech for visually impaired users. All presented methods are evaluated through listening tests and objective measures where appropriate. To conduct these experiments, phone sets and recording scripts for three Austrian German dialects have been created and speech corpora from selected native dialect speakers have been recorded in studio quality. We present a method for unsupervised dialect interpolation and show that listeners are able to correctly perceive the changes in degree of dialect for different settings of the interpolation parameter. We show that transformation of dialects while retaining the original speaker characteristics is possible with the methods presented here. We also compare different approaches for generation of fast synthetic speech. Our experiments show that linearly compressed natural speech signals are more intelligible than fast speech produced naturally by our professional speakers. Overall, this thesis shows how adaptive modeling can be applied to control and modify the language variety of a speech synthesis system.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Speech Processing
en
dc.subject
Speech Synthesis
en
dc.subject
Hidden Markov Model
en
dc.subject
Language Varieties
en
dc.subject
Dialects
en
dc.subject
Voice Conversion
en
dc.title
Transformation and interpolation of language varieties for speech synthesis
en
dc.title.alternative
Akustische Modellierung, Transformation und Interpolation von Sprachvarietäten für Sprachsynthese
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2016.25509
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Markus Toman
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E188 - Institut für Softwaretechnik und Interaktive Systeme