Application of knowledge discovery in databases : automating manual tasks
Habteselassie, Biruk (2016)
MDP in Software Development
Informaatiotieteiden yksikkö - School of Information Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2016-12-15
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:uta-201612222898
Abstract
Businesses store large volumes of data in databases and data warehouses, more than traditional analysis methods can handle. Knowledge discovery in databases (KDD) has been applied to gain insight from such large business data. In this study, I investigated the application of KDD to automate two manual tasks in a Finnish company that provides financial automation solutions. The objective of the study was to develop models from historical data and use the models to handle future transactions so as to minimize or eliminate the manual tasks.
Historical data about the manual tasks was extracted from the database. The data was prepared, and three machine learning methods were used to develop classification models from it: decision tree, Naïve Bayes, and k-nearest neighbor. The developed models were evaluated on test data.
The models were evaluated based on accuracy and prediction rate. Overall, decision tree had the highest accuracy, while k-nearest neighbor had the highest prediction rate. However, there were significant differences in performance across datasets.
Overall, the results show that there are patterns in the data that can be used to automate the manual tasks. Due to time constraints, data preparation was not done thoroughly; in future iterations, better data preparation could yield better results. Moreover, further study is required to determine the effect of transaction type on modeling. It can be concluded that knowledge discovery methods and tools can be used to automate the manual tasks.
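To make the workflow in the abstract concrete, the following is a minimal sketch of one of the three named methods, k-nearest neighbor, trained on labeled historical rows and scored for accuracy on held-out test rows. It is not the thesis implementation: the feature names, labels, and data values are invented for illustration, and the function names `knn_predict` and `accuracy` are hypothetical. The abstract does not define prediction rate, so only accuracy is computed here.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # train is a list of (features, label) pairs; features are numeric tuples.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Made-up transactions (assumption): (amount, line_count) -> manual action taken.
train = [
    ((10.0, 1.0), "approve"),
    ((12.0, 1.0), "approve"),
    ((11.0, 2.0), "approve"),
    ((95.0, 8.0), "review"),
    ((90.0, 9.0), "review"),
    ((99.0, 7.0), "review"),
]
test = [((13.0, 1.0), "approve"), ((92.0, 8.0), "review")]

print(accuracy(train, test))
```

In a setting like the one described, rows for which the model's prediction is trusted could be handled automatically, while the rest would still go to a person; how that split was made in the study is not stated in the abstract.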