Robust log-ratio methods for classifying high-dimensional metabolomics data

Walach, Jan

doi:10.34726/hss.2019.44448

DC Field

Value

Language

dc.contributor.advisor

Filzmoser, Peter

dc.contributor.author

Walach, Jan

dc.date.accessioned

2020-06-29T06:21:36Z

dc.date.issued

2019

dc.date.submitted

2019-03

dc.identifier.citation

Walach, J. (2019). Robust log-ratio methods for classifying high-dimensional metabolomics data [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2019.44448

dc.identifier.uri

https://doi.org/10.34726/hss.2019.44448

dc.identifier.uri

http://hdl.handle.net/20.500.12708/5340

dc.description

Variant title according to the author's translation

dc.description.abstract

The development of statistical methods that can handle high-dimensional data is one of the major research activities in statistics. In many fields (e.g. chemometrics, genomics, metabolomics), advanced modern techniques make it easy to measure and store such data, so there are numerous real-world applications that justify these developments. One possible way to deal with such data comes from the log-ratio point of view. A whole branch of statistics is devoted to log-ratios: compositional data analysis. Compositional data are a special type of multivariate data that describe parts of a whole, so only relative information is important. Because of these special features, applying standard statistical methods to compositional data can lead to invalid conclusions. The primary aim of the thesis is to introduce procedures for analysing high-dimensional data that originate from different groups. The main focus is on applications in metabolomics, where the groups consist of observations related to different diseases. The new methods should not only differentiate between the groups but also enable feature selection: only those features (variables) that discriminate between the groups should be identified. An important requirement for these methods is robustness against outlying observations, which are common in real data. A further interest of the thesis is the investigation of outliers in the data. We focus on both observational outliers and so-called cell outliers. The former are observations that deviate from the majority of a group in possibly all variables, while in the latter case only the values of some variables (cells) deviate for a given observation. This investigation helps to gain better insight into the data structure.

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

compositional data

dc.subject

cell-wise outliers

dc.subject

feature selection

dc.title

Robust log-ratio methods for classifying high-dimensional metabolomics data

dc.title.alternative

Robuste log-ratio Methoden zur Klassifikation von hochdimensionalen Daten aus der Metabolomik

dc.type

Thesis

dc.type

Hochschulschrift

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2019.44448

dc.contributor.affiliation

TU Wien, Österreich

dc.rights.holder

Jan Walach

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

tuw.publication.orgunit

E105 - Institut für Stochastik und Wirtschaftsmathematik

dc.type.qualificationlevel

Doctoral

dc.identifier.libraryid

AC15327937

dc.description.numberOfPages

116

dc.identifier.urn

urn:nbn:at:at-ubtuw:1-122544

dc.thesistype

Dissertation

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

tuw.advisor.orcid

0000-0002-8014-4682

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_db06

item.languageiso639-1

item.openaccessfulltext

Open Access

item.openairetype

doctoral thesis

item.grantfulltext

open

crisitem.author.dept

E105-06 - Forschungsbereich Computational Statistics

crisitem.author.parentorg

E105 - Institut für Stochastik und Wirtschaftsmathematik

Appears in Collections:

Thesis

Fulltext (Version of Record (published version))

Adobe PDF

(1.76 MB)

In Copyright
