Finnish ASR with deep transformer models
Conference article in proceedings
This publication is imported from Aalto University research portal.
View publication in the Research portal
View/Open full text file from the Research portal
Other link related to publication
Date
2020
Language
en
Pages
3630-3634 (5 pages)
Series
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Volume 2020-October, Interspeech
Abstract
Recently, BERT and Transformer-XL based architectures have achieved strong results in a range of NLP applications. In this paper, we explore Transformer architectures, BERT and Transformer-XL, as language models for a Finnish ASR task with different rescoring schemes. We achieve strong results in both an intrinsic and an extrinsic task with Transformer-XL, attaining 29% better perplexity and 3% better WER than our previous best LSTM-based approach. We also introduce a novel three-pass decoding scheme which improves the ASR performance by 8%. To the best of our knowledge, this is also the first work (i) to formulate an alpha smoothing framework for using the non-autoregressive BERT language model in an ASR task, and (ii) to explore sub-word units with Transformer-XL for an agglutinative language like Finnish.
Description
openaire: EC/H2020/780069/EU//MeMAD
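To make the rescoring idea in the abstract concrete, the following is a minimal sketch of how a non-autoregressive BERT language model can score ASR n-best hypotheses via masked pseudo-log-likelihood, combined with a simple alpha-weighted interpolation against the first-pass score. This is an illustration only, not the paper's implementation: the checkpoint name, the `rescore` helper, and the exact placement of the alpha weight are assumptions, and the paper's alpha smoothing framework is defined in the full text.

```python
# Illustrative sketch: BERT pseudo-log-likelihood rescoring of ASR n-best lists.
# Assumptions (not from the paper): the FinBERT checkpoint below, the rescore()
# helper, and this particular alpha interpolation.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "TurkuNLP/bert-base-finnish-cased-v1"  # assumed Finnish BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def bert_pseudo_log_likelihood(sentence: str) -> float:
    """Sum log p(token | rest), masking each position in turn.

    BERT is non-autoregressive, so it has no left-to-right chain rule;
    the pseudo-log-likelihood is a common surrogate sentence score.
    """
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        total += log_probs[ids[i]].item()
    return total

def rescore(nbest: list[dict], alpha: float = 0.5) -> dict:
    """Pick the hypothesis maximizing an alpha-weighted mix of the
    first-pass decoder score and the BERT pseudo-log-likelihood."""
    return max(
        nbest,
        key=lambda h: (1.0 - alpha) * h["first_pass_score"]
                      + alpha * bert_pseudo_log_likelihood(h["text"]),
    )
```

In practice the alpha weight would be tuned on a development set, and the n-best entries would carry the first-pass acoustic-plus-LM score produced by the decoder.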
Keywords
BERT, Language modeling, Speech recognition, Transformer-XL, Transformers
Citation
Jain, A., Rouhe, A., Grönroos, S. A., & Kurimo, M. (2020). Finnish ASR with deep transformer models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2020-October, pp. 3630-3634). International Speech Communication Association (ISCA). Interspeech, Shanghai, China, 25 October 2020. https://doi.org/10.21437/Interspeech.2020-1784