A convolutional neural network approach for acoustic scene classification

Valenti, Michele; Squartini, Stefano; Diment, Aleksandr; Parascandolo, Giambattista; Virtanen, Tuomas (2017-06-30)

IEEE

30.06.2017

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

doi:10.1109/IJCNN.2017.7966035

The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201802141222

Description

Peer reviewed

Abstract

This paper presents a novel application of convolutional neural networks (CNNs) to the task of acoustic scene classification (ASC). We propose the use of a CNN trained to classify short sequences of audio, represented by their log-mel spectrograms. We also introduce a training method that can be used under particular circumstances to make full use of small datasets. The proposed system is tested and evaluated on three different ASC datasets and compared to other state-of-the-art systems that competed in the 'Detection and Classification of Acoustic Scenes and Events' (DCASE) challenges held in 2016 and 2013. The best accuracy scores obtained by our system on the DCASE 2016 datasets are 79.0% (development) and 86.2% (evaluation), which constitute improvements of 6.4% and 9%, respectively, over the baseline system. Finally, when tested on the DCASE 2013 evaluation dataset, the proposed system reaches an accuracy of 77.0%, improving on the challenge winner's score by 1%.
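As a rough illustration of the input representation mentioned in the abstract, the following minimal sketch shows two generic building blocks of a log-mel front end: band edges spaced evenly on the mel scale, and the slicing of a spectrogram into short fixed-length sequences. It is not the authors' implementation. The function names, the HTK-style mel formula, the number of bands, and the sequence length are all assumptions chosen for illustration; the abstract does not state them.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 edge frequencies (Hz), equally spaced in mels.

    Band i of a triangular mel filterbank would span edges[i]..edges[i + 2].
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


def log_compress(band_energies, eps=1e-10):
    """Apply the logarithm to mel-band energies; eps avoids log(0)."""
    return [math.log(e + eps) for e in band_energies]


def split_into_sequences(frames, seq_len):
    """Cut a list of spectrogram frames into non-overlapping sequences.

    Each sequence holds seq_len consecutive frames; an incomplete tail
    is dropped. Each sequence would be one input example for a classifier.
    """
    n_full = len(frames) // seq_len
    return [frames[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]


# Hypothetical settings: 40 mel bands up to 22.05 kHz, 10-frame sequences.
edges = mel_band_edges(n_bands=40, f_min=0.0, f_max=22050.0)
dummy_frames = [log_compress([1.0] * 40) for _ in range(25)]
sequences = split_into_sequences(dummy_frames, seq_len=10)
```

With these placeholder settings, the 25 dummy frames yield two full sequences and the last five frames are discarded. A practical system would have to choose how to handle such a tail and how to combine per-sequence predictions into one label per recording.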

Collections

TUNICRIS publications [16944]