<div class="csl-bib-body">
<div class="csl-entry">Bauer, P. R. (2017). <i>Boosting classifications with imbalanced data</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2017.45341</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2017.45341
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/5289
-
dc.description
Title differs from the original, following the author's translation
-
dc.description.abstract
Boosting is an ensemble method that uses a “weak” classifier to create a “strong” one, building on the theory developed by Robert Schapire (see Schapire 1990). It appears similar to bagging, yet it is fundamentally different. This thesis starts with a short introduction, followed by a chapter describing the theory and methodology behind boosting. The next chapter presents a set of boosting algorithms applicable to binary, multi-class and regression problems. The major focus of the thesis is the performance of boosting algorithms on imbalanced data sets. The issue with these data sets is that classifiers tend to emphasize the larger classes, which leads to significant class distribution skews. An established general solution to this issue is to apply sampling methods. After introducing these, the simulations chapter demonstrates that boosting algorithms work well with minority sampling in binary classification, whereas majority sampling appears to be preferable in the multi-class problem. However, it is also shown that, in the multi-class setting, the built-in re-weighting of hard-to-classify cases in the boosting algorithms AdaBoost.M1 and SAMME is sufficient to handle imbalances in the data set, with no sampling necessary.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Statistics
en
dc.subject
Classification
en
dc.subject
Boosting
en
dc.title
Boosting classifications with imbalanced data
en
dc.title.alternative
Boosting Classifications with Imbalanced Data
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2017.45341
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Philipp Rudolf Bauer
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E105 - Institut für Stochastik und Wirtschaftsmathematik
-
dc.type.qualificationlevel
Diploma
-
dc.identifier.libraryid
AC14500523
-
dc.description.numberOfPages
90
-
dc.identifier.urn
urn:nbn:at:at-ubtuw:1-104275
-
dc.thesistype
Diplomarbeit
de
dc.thesistype
Diploma Thesis
en
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.advisor.orcid
0000-0002-8014-4682
-
item.fulltext
with Fulltext
-
item.cerifentitytype
Publications
-
item.mimetype
application/pdf
-
item.openairecristype
http://purl.org/coar/resource_type/c_bdcc
-
item.languageiso639-1
en
-
item.openaccessfulltext
Open Access
-
item.openairetype
master thesis
-
item.grantfulltext
open
-
crisitem.author.dept
E105 - Institut für Stochastik und Wirtschaftsmathematik