Dissertation/Thesis Abstract

Deep Learning for Attribute Inference, Parsing, and Recognition of Face
by Luo, Ping, Ph.D., The Chinese University of Hong Kong (Hong Kong), 2014, 122; 3691916
Abstract (Summary)

Deep learning has been widely and successfully applied to many difficult tasks in computer vision, such as image parsing, object detection, and object recognition, where various deep learning architectures such as deep neural networks, convolutional deep neural networks, and deep belief networks have achieved impressive performance and significantly outperformed state-of-the-art methods. However, the potential of deep learning in face-related problems has not yet been fully explored. In this thesis, we explore different deep learning methods and propose new network architectures and learning algorithms for face-related applications, such as face parsing, face attribute inference, and face recognition.

For face parsing, we propose a novel face parser, which recasts segmentation of face components as a cross-modality data transformation problem, i.e., transforming an image patch to a label map. Specifically, a face is represented hierarchically by parts, components, and pixel-wise labels. With this representation, the approach first detects faces at both the part and component levels, and then computes the pixel-wise label maps. The part-based and component-based detectors are generatively trained with the deep belief network (DBN), and are discriminatively tuned by logistic regression. The segmentators transform the detected face components to label maps, which are obtained by learning a highly nonlinear mapping with the deep autoencoder. The proposed hierarchical face parsing is not only robust to partial occlusions but also provides richer information for face analysis and face synthesis than face keypoint detection and face alignment.
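The control flow of the hierarchy described above (parts, then components, then pixel-wise labels) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `parse_face`, `detect_parts`, `detect_components`, and `segment` are hypothetical, the detectors are fixed stubs standing in for the DBN-based detectors, and a simple threshold stands in for the deep autoencoder segmentator.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (top, left, height, width)
Image = List[List[float]]
LabelMap = List[List[int]]


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def parse_face(
    image: Image,
    detect_parts: Callable[[Image], List[Box]],
    detect_components: Callable[[Image], List[Tuple[int, Box]]],
    segment: Callable[[Image], LabelMap],
) -> LabelMap:
    """Detect parts, detect components inside each part, then write each
    component's binary mask into a full-size label map (0 = background)."""
    labels = [[0] * len(image[0]) for _ in image]
    for p_top, p_left, p_h, p_w in detect_parts(image):
        part = crop(image, (p_top, p_left, p_h, p_w))
        # Component boxes are expressed relative to the part's top-left corner.
        for label_id, (c_top, c_left, c_h, c_w) in detect_components(part):
            mask = segment(crop(part, (c_top, c_left, c_h, c_w)))
            for r in range(c_h):
                for c in range(c_w):
                    if mask[r][c]:
                        labels[p_top + c_top + r][p_left + c_left + c] = label_id
    return labels


# Hypothetical stand-ins for the learned models (assumptions, not from the thesis).
def stub_detect_parts(image: Image) -> List[Box]:
    return [(0, 1, 4, 5)]

def stub_detect_components(part: Image) -> List[Tuple[int, Box]]:
    return [(1, (0, 0, 2, 2)), (2, (2, 3, 2, 2))]

def stub_segment(patch: Image) -> LabelMap:
    return [[1 if pixel > 0.5 else 0 for pixel in row] for row in patch]


image = [
    [0.1, 0.9, 0.8, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6, 0.9],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.8],
]
label_map = parse_face(image, stub_detect_parts, stub_detect_components, stub_segment)
```

Passing the detectors and the segmentator in as callables keeps the hierarchy separate from the models, so learned components could replace the stubs without changing the loop.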

For face attribute inference, the proposed approach captures the interdependencies of local regions for each attribute, as well as the high-order correlations between different attributes, which makes it more robust to occlusions and misdetection of face regions. First, we model region interdependencies with a discriminative decision tree, where each node consists of a detector and a classifier trained on a local region. The detector allows us to locate the region, while the classifier determines the presence or absence of an attribute. Second, correlations of attributes and attribute predictors are modeled by organizing all of the decision trees into a large sum-product network (SPN), which is learned by the EM algorithm and yields the most probable explanation (MPE) of the facial attributes in terms of the region’s localization and classification. Experimental results on a large data set with 22,400 images show the effectiveness of the proposed approach.
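To make the MPE step concrete, the following is a minimal sketch of the common max-product procedure on a toy sum-product network: sums are replaced by maximization in an upward pass, and a downward pass follows the best child of each sum node. It is an illustration under stated assumptions, not the network from the thesis: the attribute names `male` and `beard` are hypothetical, the weights are set by hand rather than learned with EM, and the leaves are plain indicators rather than decision trees of region detectors and classifiers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Leaf:
    var: str     # attribute name
    value: int   # state on which this indicator fires (1 = present, 0 = absent)


@dataclass
class Product:
    children: List["Node"]


@dataclass
class Sum:
    children: List["Node"]
    weights: List[float]


Node = Union[Leaf, Product, Sum]


def max_value(node: Node, evidence: Dict[str, int]) -> float:
    """Upward pass with every sum replaced by a max over weighted children."""
    if isinstance(node, Leaf):
        observed = evidence.get(node.var)
        return 1.0 if observed is None or observed == node.value else 0.0
    if isinstance(node, Product):
        result = 1.0
        for child in node.children:
            result *= max_value(child, evidence)
        return result
    return max(w * max_value(c, evidence)
               for c, w in zip(node.children, node.weights))


def mpe(node: Node, evidence: Dict[str, int],
        assignment: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Downward pass: follow the best child of each sum, all children of each product."""
    if assignment is None:
        assignment = dict(evidence)
    if isinstance(node, Leaf):
        assignment.setdefault(node.var, node.value)  # never overrides evidence
    elif isinstance(node, Product):
        for child in node.children:
            mpe(child, evidence, assignment)
    else:
        best_child, _ = max(zip(node.children, node.weights),
                            key=lambda cw: cw[1] * max_value(cw[0], evidence))
        mpe(best_child, evidence, assignment)
    return assignment


def attribute(var: str, p_present: float) -> Sum:
    """A Bernoulli distribution over one attribute, written as a sum over indicators."""
    return Sum([Leaf(var, 1), Leaf(var, 0)], [p_present, 1.0 - p_present])


# Toy mixture of two components with hand-set (not learned) weights.
root = Sum(
    [Product([attribute("male", 0.9), attribute("beard", 0.4)]),
     Product([attribute("male", 0.1), attribute("beard", 0.05)])],
    [0.6, 0.4],
)

no_evidence = mpe(root, {})
given_beard = mpe(root, {"beard": 1})
```

In this toy network, observing `beard = 1` changes the most probable state of `male`, which is the kind of attribute correlation the paragraph above describes. The upward pass is recomputed at every sum node for brevity; a practical implementation would cache node values.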

For face recognition, this thesis proposes a new deep learning framework that recovers the canonical view of face images. It dramatically reduces the intra-person variance while maintaining the inter-person discriminativeness. Unlike existing face reconstruction methods, which were either evaluated in a controlled 2D environment or employed 3D information, our approach directly learns the transformation between face images with a complex set of variations and their canonical views. At the training stage, to avoid the costly process of labeling canonical-view images from the training set by hand, we have devised a new measurement and algorithm to automatically select or synthesize a canonical-view image for each identity. The recovered canonical-view face images are matched by using a facial component-based convolutional neural network. Our approach achieves the best performance on the LFW dataset under the unrestricted protocol. We also demonstrate that the performance of existing methods can be improved if they are applied to our recovered canonical-view face images.

Indexing (document details)
Advisor: Sun, Han Qiu
Committee:
School: The Chinese University of Hong Kong (Hong Kong)
School Location: Hong Kong
Source: DAI-B 76/08(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Computer vision, Deep learning, Face alignment, Face recognition
Publication Number: 3691916
ISBN: 9781321669039
Copyright © 2019 ProQuest LLC. All rights reserved.