Applying Machine Learning to Root Cause Analysis in Agile CI/CD Software Testing Environments

School of Science | Master's thesis
Date
2019-01-28
Major/Subject
Acoustics and Audio Technology
Mcode
ELEC3030
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
79
Abstract
This thesis evaluates machine learning classification and clustering algorithms with the aim of automating the root cause analysis of failed tests in agile software testing environments. The inefficiency of manually categorizing the root causes in terms of time and human resources motivates this work. The development and testing environments of an agile team at Ericsson Finland are used as this work's framework. The author of the thesis extracts relevant features from the raw log data after interviewing the team's testing engineers (human experts). The author puts his initial efforts into clustering the unlabeled data, and despite obtaining qualitative correlations between several clusters and failure root causes, the vagueness in the rest of the clusters leads to the consideration of labeling. The author then carries out a new round of interviews with the testing engineers, which leads to the conceptualization of ground-truth categories for the test failures. With these, the human experts label the dataset accordingly. A collection of artificial neural networks that either classify the data or pre-process it for clustering is then optimized by the author. The best solution comes in the form of a classification multilayer perceptron that correctly assigns the failure category to new examples, on average, 88.9% of the time. The primary outcome of this thesis comes in the form of a methodology for the extraction of expert knowledge and its adaptation to machine learning techniques for test failure root cause analysis using test log data. The proposed methodology constitutes a prototype or baseline approach towards achieving this objective in a corporate environment.
Supervisor
Jung, Alexander
Thesis advisor
Huuhtanen, Timo
Keywords
root cause analysis, software testing, log data analysis, machine learning, neural networks, automation