Automated text sentiment analysis for Finnish language using deep learning
Nukarinen, Ville (2018)
Nukarinen, Ville
2018
Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2018-06-06
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201805221790
Abstract
The growing amount of information online opens new possibilities for companies to gather valuable data about their products and services. However, such a large amount of information must be collected and analyzed before it becomes useful. Gathering and classifying millions of data items by hand is not an option, so information collection and analysis must be automated for the process to be viable.
This thesis focuses on the automatic analysis of the data, specifically sentiment analysis of product reviews written in Finnish. It investigates the feasibility of sentiment analysis for Finnish using the latest deep learning techniques in the field of machine learning.
The results show that sentiment analysis for Finnish is feasible and that, with some improvements to the classifier, it could be used for automated sentiment analysis. However, the difference between the validation and test set results shows that further measures, such as additional regularization, may be needed to train a classifier that generalizes better to unseen data.