Transformation and interpolation of language varieties for speech synthesis

Toman, Markus

doi:10.34726/hss.2016.25509

DC Field

Value

Language

dc.contributor.advisor

Rauber, Andreas

dc.contributor.author

Toman, Markus

dc.date.accessioned

2020-06-29T11:07:46Z

dc.date.issued

2016

dc.date.submitted

2016-04

dc.identifier.citation

Toman, M. (2016). Transformation and interpolation of language varieties for speech synthesis [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2016.25509

dc.identifier.uri

https://doi.org/10.34726/hss.2016.25509

dc.identifier.uri

http://hdl.handle.net/20.500.12708/6058

dc.description

Abstract in German

dc.description

Title differs according to the author's translation

dc.description.abstract

This thesis aims to advance the field of speech synthesis by investigating and developing new concepts for acoustic modeling, transformation and interpolation of language varieties (i.e., dialects, sociolects, foreign accents). The goal is to enable systems with speech output to adapt to the individual needs and preferences of their users. Transformation of language varieties converts a voice model in one variety into a model in another variety while retaining the voice characteristics. Interpolation between multiple voice models of different varieties makes it possible to generate intermediate varieties. Both approaches are used to widen the range of speaking styles available to speech output systems. In addition, two specific applications are investigated in this thesis: foreign accent reduction and the generation of intelligible fast speech for visually impaired users. All presented methods are evaluated through listening tests and objective measures where appropriate. To conduct these experiments, phone sets and recording scripts for three Austrian German dialects were created, and speech corpora from selected native dialect speakers were recorded in studio quality. We present a method for unsupervised dialect interpolation and show that listeners correctly perceive the changes in degree of dialect for different settings of the interpolation parameter. We show that transformation of dialects while retaining the original speaker characteristics is possible with the methods presented here. We also compare different approaches for generating fast synthetic speech. Our experiments show that linearly compressed natural speech signals are more intelligible than fast speech produced naturally by our professional speakers. Overall, this thesis shows how adaptive modeling can be applied to control and modify the language variety of a speech synthesis system.

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

Speech Processing

dc.subject

Speech Synthesis

dc.subject

Hidden Markov Model

dc.subject

Language Varieties

dc.subject

Dialects

dc.subject

Voice Conversion

dc.title

Transformation and interpolation of language varieties for speech synthesis

dc.title.alternative

Akustische Modellierung, Transformation und Interpolation von Sprachvarietäten für Sprachsynthese

dc.type

Thesis

dc.type

University thesis

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2016.25509

dc.contributor.affiliation

TU Wien, Austria

dc.rights.holder

Markus Toman

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

tuw.publication.orgunit

E188 - Institut für Softwaretechnik und Interaktive Systeme

dc.type.qualificationlevel

Doctoral

dc.identifier.libraryid

AC13088333

dc.description.numberOfPages

124

dc.identifier.urn

urn:nbn:at:at-ubtuw:1-1503

dc.thesistype

Dissertation

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_db06

item.languageiso639-1

item.openaccessfulltext

Open Access

item.openairetype

doctoral thesis

item.grantfulltext

open

crisitem.author.dept

TU Wien

Appears in Collections:

Thesis

Fulltext (Version of Record (published version))

Adobe PDF

(9.61 MB)

In Copyright
