Text Mining for Global Reporting Initiative (GRI) Standards: Study of Nordic listed companies.
Gutierrez, Marcelo (2020)
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:amk-2020060417028
Abstract
This thesis uses text mining methods to investigate whether Corporate Social Responsibility reports have been published following the guidelines of the Global Reporting Initiative (GRI) Standards. To this end, several text mining techniques were implemented: Latent Dirichlet Allocation (LDA), term frequency-inverse document frequency (TF-IDF), fastText, Global Vectors (GloVe), Latent Semantic Analysis (LSA), word2vec and doc2vec. The results are analysed from an unsupervised learning perspective. We extracted string-based, corpus-based and hybrid semantic similarities and evaluated the models through the intrinsic assessment methodology. Index matching was developed to complement the semantic evaluation. The final results show that LSA and GloVe are the best options for our study of Nordic listed companies.
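To make the idea of corpus-based similarity concrete, the following is a minimal sketch of one of the techniques named above, TF-IDF weighting combined with cosine similarity, used to rank report passages against a disclosure description. It is an illustration only and not the thesis's implementation: the function names (tokenize, tfidf_vectors, cosine), the smoothed IDF formula and both example texts are assumptions made for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document.

    Assumption: a smoothed IDF, log((1 + n) / (1 + df)) + 1, is used so that
    terms occurring in every document still get a non-zero weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical texts, invented for this sketch (not taken from the thesis data).
gri_disclosure = "Direct greenhouse gas emissions in metric tons of CO2 equivalent"
report_passages = [
    "Our direct greenhouse gas emissions fell to 1200 metric tons of CO2 equivalent",
    "The board of directors met eight times during the financial year",
]

vectors = tfidf_vectors([gri_disclosure] + report_passages)
scores = [cosine(vectors[0], v) for v in vectors[1:]]
best = max(range(len(scores)), key=scores.__getitem__)
print("Most similar passage:", report_passages[best])
```

In this toy setting the passage that shares emission-related vocabulary with the disclosure text ranks first. Embedding-based methods such as GloVe or LSA would replace the sparse TF-IDF vectors with dense ones while keeping the same cosine comparison.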