Dynamic probabilistic linear discriminant analysis for video classification
Fabris, Alessandro; Nicolaou, Mihalis A.; Kotsia, Irene; Zafeiriou, Stefanos (2017-03-05)
A. Fabris, M. A. Nicolaou, I. Kotsia and S. Zafeiriou, "Dynamic Probabilistic Linear Discriminant Analysis for video classification," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017, pp. 2781-2785. doi: 10.1109/ICASSP.2017.7952663
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
https://rightsstatements.org/vocab/InC/1.0/
https://urn.fi/URN:NBN:fi-fe201902276428
Tiivistelmä
Abstract
Component Analysis (CA) comprises of statistical techniques that decompose signals into appropriate latent components, relevant to a task-at-hand (e.g., clustering, segmentation, classification). Recently, an explosion of research in CA has been witnessed, with several novel probabilistic models proposed (e.g., Probabilistic Principal CA, Probabilistic Linear Discriminant Analysis (PLDA), Probabilistic Canonical Correlation Analysis). PLDA is a popular generative probabilistic CA method, that incorporates knowledge regarding class-labels and furthermore introduces class-specific and sample-specific latent spaces. While PLDA has been shown to outperform several state-of-the-art methods, it is nevertheless a static model; any feature-level temporal dependencies that arise in the data are ignored. As has been repeatedly shown, appropriate modelling of temporal dynamics is crucial for the analysis of temporal data (e.g., videos). In this light, we propose the first, to the best of our knowledge, probabilistic LDA formulation that models dynamics, the so-called Dynamic-PLDA (DPLDA). DPLDA is a generative model suitable for video classification and is able to jointly model the label information (e.g., face identity, consistent over videos of the same subject), as well as dynamic variations of each individual video. Experiments on video classification tasks such as face and facial expression recognition show the efficacy of the proposed method.
Kokoelmat
- Avoin saatavuus [31941]