Finnish ASR with deep transformer models

Conference article in proceedings
Date
2020
Language
en
Pages
3630-3634 (5 pages)
Series
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH; Volume 2020-October
Abstract
Recently, BERT and Transformer-XL based architectures have achieved strong results in a range of NLP applications. In this paper, we explore Transformer architectures (BERT and Transformer-XL) as language models for a Finnish ASR task with different rescoring schemes. We achieve strong results in both an intrinsic and an extrinsic task with Transformer-XL, achieving 29% better perplexity and 3% better WER than our previous best LSTM-based approach. We also introduce a novel three-pass decoding scheme which improves the ASR performance by 8%. To the best of our knowledge, this is also the first work (i) to formulate an alpha smoothing framework to use the non-autoregressive BERT language model for an ASR task, and (ii) to explore sub-word units with Transformer-XL for an agglutinative language like Finnish.
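Since BERT is non-autoregressive, it cannot assign a left-to-right sentence probability; a common workaround is a pseudo-log-likelihood obtained by masking each token in turn. The sketch below illustrates this idea for N-best rescoring, with the BERT score flattened by an exponent alpha before interpolation with the acoustic score. This is a minimal illustration only: the model name `TurkuNLP/bert-base-finnish-cased-v1` is a placeholder (the paper trains its own sub-word models), and the paper's exact alpha-smoothing formulation and three-pass scheme may differ.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

# Placeholder Finnish BERT; the paper's own trained model differs.
tokenizer = BertTokenizer.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
model = BertForMaskedLM.from_pretrained("TurkuNLP/bert-base-finnish-cased-v1")
model.eval()

def bert_pseudo_loglik(sentence: str, alpha: float = 0.5) -> float:
    """Masked pseudo-log-likelihood, scaled by a smoothing exponent alpha."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Mask each position in turn and score the original token there.
    for pos in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, pos]
        log_probs = torch.log_softmax(logits, dim=-1)
        total += log_probs[ids[pos]].item()
    return alpha * total

def rescore(nbest, lm_weight: float = 0.8, alpha: float = 0.5):
    """Pick the hypothesis maximising acoustic score + weighted BERT score.

    `nbest` is assumed to be a list of dicts with "text" and "am_score" keys.
    """
    return max(
        nbest,
        key=lambda h: h["am_score"] + lm_weight * bert_pseudo_loglik(h["text"], alpha),
    )
```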
Description
openaire: EC/H2020/780069/EU//MeMAD
Keywords
BERT, Language modeling, Speech recognition, Transformer-XL, Transformers
Citation
Jain, A., Rouhe, A., Grönroos, S. A., & Kurimo, M. (2020). Finnish ASR with deep transformer models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2020-October, pp. 3630-3634). International Speech Communication Association (ISCA). Interspeech, Shanghai, China, 25/10/2020. https://doi.org/10.21437/Interspeech.2020-1784