End-to-end named entity recognition for spoken Finnish

Loading...
Thumbnail Image
Journal Title
Journal ISSN
Volume Title
Perustieteiden korkeakoulu | Master's thesis
Date
2020-10-20
Department
Major/Subject
Machine Learning, Data Science and Artificial Intelligence
Mcode
SCI3044
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
57+9
Series
Abstract
Named entity recognition is a natural language processing task in which the system tries to find named entities and classify them in predefined categories. The categories can vary, depending on the domain in which they are going to be used but some of the most common include: person, location, organization, date and product. Named entity recognition is an integral part of other large natural language processing tasks, such as information retrieval, text summarization, machine translation, and question answering. Doing named entity recognition is a difficult task due to the lack of annotated data for certain languages or domains. Named entity ambiguity is another challenging aspect that arises when doing named entity recognition. Often times, a word can represent a person, organization, product, or any other category, depending on the context it appears in. Spoken data, which can be the output of a speech recognition system, imposes additional challenges to the named entity recognition system. Named entities are often capitalized and the system learns to rely on capitalization in order to detect the entities, which is neglected in the speech recognition output. The standard way of doing named entity recognition from speech involves a pipeline approach of two systems. First, a speech recognition system transcribes the speech and generates the transcripts, after which a named entity recognition system annotates the transcripts with the named entities. Since the speech recognition system is not perfect and makes errors, those errors are propagated to the named entity recognition system, which is hard to recover from. In this thesis, we present two approaches of doing named entity recognition from Finnish speech in an end-to-and manner, where one system generates the transcripts and the annotations. We will explore the strengths and weaknesses of both approaches and see how they compare to the standard pipeline approach.
Description
Supervisor
Kurimo, Mikko
Thesis advisor
Leinonen, Juho
Keywords
named entity recognition, speech recognition, end-to-end, low-resource
Other note
Citation