We present a novel method for classifying emotions from static facial images. Our approach leverages the recent success of Convolutional Neural Networks (CNNs) on face recognition problems. Unlike the settings often assumed there, far less labeled data is typically available for training emotion classification systems. Our method is therefore designed to simplify the problem domain by removing confounding factors from the input images, with an emphasis on image illumination variations, in an effort to reduce the amount of data required to effectively train deep CNN models. To this end, we propose novel transformations of image intensities to 3D spaces, designed to be invariant to monotonic photometric transformations. These are applied to CASIA Webface images, which are then used to train an ensemble of CNNs of multiple architectures on multiple representations. Each model is then fine-tuned with the limited emotion-labeled training data to obtain the final classification models. Our method was tested on the Static Facial Expression Recognition (SFEW) sub-challenge of the Emotion Recognition in the Wild Challenge (EmotiW 2015) and shown to provide a substantial improvement of 15.36 percentage points over the baseline results (a 40% relative gain in performance).
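The invariance claimed above rests on an ordering-based encoding in the spirit of Local Binary Patterns (one of the paper's keywords). The following is a minimal NumPy sketch, not the paper's exact 3D mapping, illustrating why a code built only from intensity orderings is unchanged by any strictly monotonic photometric transformation:

```python
import numpy as np

def lbp_codes(img):
    """3x3 Local Binary Patterns (illustrative sketch).

    Each pixel's 8 neighbours are compared against the centre pixel and
    the resulting bits are packed into an 8-bit code. Since only the
    *ordering* of intensities matters, the codes are invariant to any
    strictly monotonic (e.g. gamma-like) intensity transformation.
    """
    padded = np.pad(img, 1, mode='edge')
    h, w = img.shape
    # Clockwise neighbour offsets starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes |= (neigh >= img).astype(np.uint8) << bit
    return codes
```

For example, applying a gamma curve `img ** 0.5` changes every pixel value but preserves their relative ordering, so `lbp_codes` returns an identical code image for both versions.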
Title of host publication: ICMI 2015 - Proceedings of the 2015 ACM International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery, Inc
Number of pages: 8
State: Published - 9 Nov 2015
Event: ACM International Conference on Multimodal Interaction, ICMI 2015 - Seattle, United States
Duration: 9 Nov 2015 → 13 Nov 2015
Name: ICMI 2015 - Proceedings of the 2015 ACM International Conference on Multimodal Interaction
Conference: ACM International Conference on Multimodal Interaction, ICMI 2015
Period: 9/11/15 → 13/11/15
Bibliographical note: Publisher Copyright: © 2015 ACM.
Keywords:
- Deep learning
- EmotiW 2015 challenge
- Emotion recognition
- Local binary patterns