Build failure prediction in continuous integration workflows / by Thomas Rausch
Author: Rausch, Thomas
Reviewer: Schulte, Stefan
Published: Vienna, 2016
Extent: xvi, 128 pages : illustrations, diagrams
Thesis note: Technische Universität Wien, diploma thesis, 2016
Includes an abstract in German
Keywords (German and English): Empirical Software Engineering / Machine Learning / Predictive Analytics / Continuous Integration / Build Failure Prediction
Persistent identifier (URN): urn:nbn:at:at-ubtuw:1-7311
The work is freely available
Abstract (English)

Continuous integration (CI) is a practice in which developers frequently integrate their work into the mainline of development. A CI server monitors the source code repository of a project and automatically executes the software build process when new changes are checked in. If a build fails, developers have to identify and fix the cause of the broken build, which delays the integration process and stalls further development. Large software projects often have long-running builds that exacerbate this problem. Despite the widespread use of CI, little is known about the variety of errors that cause builds to fail. Yet understanding when and why build errors occur is an important step towards improving developer productivity in the CI workflow. By identifying characteristics of development practices that cause build failures, we can predict preliminary results for an integration. This helps developers react to possible problems even before a build is initiated, thereby saving time and resources. In this thesis, we introduce CInsight, a framework for analyzing CI workflows and build failures. We conduct an empirical study on real-world data from 14 open source software projects. Data from source code repositories and build systems are explored to gather qualitative and quantitative evidence about the variety and frequency of CI build errors. Statistical methods are used to examine the relationship between development practices and build failures. Based on the results, we devise a method for CI build failure prediction. Our results show that failing unit tests and violations of code quality rules are the main causes of build failures. The statistical analyses reveal that the type and number of previous errors are the strongest predictors of future failures. Our best prediction models yield average recall and precision values of 0.82 and 0.80, respectively. Furthermore, our approach allows a prediction to be updated during the execution of a build.
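
The recall and precision figures quoted in the abstract are standard classification metrics. As a minimal sketch of how they are computed for build failure prediction, the Python snippet below treats a failed build as the positive class. The function name, the outcome labels, and the six sample builds are hypothetical illustrations; they are not taken from the thesis or from CInsight.

```python
def precision_recall(actual, predicted, positive="failed"):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(actual, predicted))
    # True positives: builds that failed and were predicted to fail.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False positives: builds that passed but were predicted to fail.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # False negatives: builds that failed but were predicted to pass.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical outcomes for six builds (illustrative data only).
actual = ["failed", "passed", "failed", "passed", "failed", "passed"]
predicted = ["failed", "failed", "failed", "passed", "passed", "passed"]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of predicted failures that really failed, and recall is the share of real failures that the model caught. A model with high recall but low precision would raise many false alarms before builds start, so the two values are usually reported together.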
