Lo, Tsz-Wai Rachel
Feature extraction for range image interpretation using local topology statistics.
PhD thesis, University of Glasgow.
This thesis presents an approach for interpreting range images of known subject matter, such as the human face, based on the extraction and matching of local features from the images. In recent years, approaches to interpreting two-dimensional (2D) images based on local feature extraction have advanced greatly; for example, systems such as the Scale Invariant Feature Transform (SIFT) can detect and describe local features in 2D images effectively. With the aid of rapidly advancing three-dimensional (3D) imaging technology, in particular the advent of commercially available surface scanning systems based on photogrammetry, image representation has been able to extend into the third dimension. Moreover, range images confer a number of advantages over conventional 2D images, for instance invariance to changes in lighting, pose and viewpoint. This work therefore attempts to establish how best to represent the local range surface with a feature descriptor, and thereby to develop a matching system that takes advantage of the third dimension present in range images, cast in the framework of an existing scale- and rotation-invariant recognition technology: SIFT.
By exploring statistical representations of the local surface variation, it is possible to represent and match range images of human faces. This is achieved by extracting unique mathematical keys, known as feature descriptors, at automatically generated stable keypoint locations in the range images, thereby capturing simultaneously the local distributions of the mixes of surface types and of their orientations. Keypoints are generated through a scale-space approach, in which the (x,y) location and the appropriate scale (sigma) are detected. To achieve invariance to in-plane rotational changes of viewpoint, a consistent canonical orientation is assigned to each keypoint and the sampling patch is rotated to this canonical orientation. The mixes of surface types, derived using the shape index, and the image gradient orientations are extracted from each sampling patch by placing nine overlapping Gaussian sub-regions over the measurement aperture. The nine sub-regions overlap one another by one standard deviation in order to minimise spatial aliasing during the sampling stages and to provide better continuity within the descriptor.
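As an illustration of the surface-type measure mentioned above, the following is a minimal sketch of a per-pixel shape index computed from a range image with NumPy. It assumes the common definition S = (2/pi) * arctan((k1 + k2) / (k1 - k2)) in terms of the principal curvatures k1 >= k2, with curvatures estimated by finite differences; the abstract does not state which formulation, sign convention or smoothing the thesis uses, so the function name `shape_index` and the `spacing` parameter are assumptions made for this example only.

```python
import numpy as np

def shape_index(z, spacing=1.0):
    """Per-pixel shape index in [-1, 1] for a range image z[row, col].

    Assumed convention (not taken from the thesis): z increases upwards,
    so a bowl-shaped pit maps to +1 and a dome to -1.
    """
    # First derivatives; np.gradient returns the axis-0 (row, y) term first.
    zy, zx = np.gradient(z, spacing)
    # Second derivatives, obtained by differentiating the first derivatives.
    zxy, zxx = np.gradient(zx, spacing)
    zyy, _ = np.gradient(zy, spacing)

    g = 1.0 + zx**2 + zy**2
    # Gaussian (K) and mean (H) curvature of the surface (x, y, z(x, y)).
    K = (zxx * zyy - zxy**2) / g**2
    H = ((1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy) / (2 * g**1.5)

    # Half the difference of the principal curvatures, clipped so that
    # round-off cannot make the argument of the square root negative.
    half_diff = np.sqrt(np.maximum(H**2 - K, 0.0))
    # (k1 + k2) / (k1 - k2) equals H / half_diff; arctan2 avoids a division
    # by zero at umbilic points, where half_diff is 0.
    return (2.0 / np.pi) * np.arctan2(H, half_diff)
```

With this convention a saddle gives a value near 0 and a bowl a value near +1. A perfectly flat patch also returns 0 here, although the shape index is strictly undefined on a plane, so a practical system would likely mask such pixels using a curvature-magnitude threshold.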
Moreover, surface normals can be computed at each keypoint location, allowing the local 3D pose to be estimated and corrected within the feature descriptors, since the orientations in which the images were captured are unknown a priori. As a result, the formulated feature descriptors have strong discriminative power and are stable to rotational changes.
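To make the surface-normal step concrete, the sketch below estimates unit normals from a range image treated as a height field z(x, y), using the standard relation n = (-dz/dx, -dz/dy, 1) normalised to unit length. This shows only the normal estimation; how the thesis uses the normals to correct local pose is not described in the abstract and is not attempted here. The function name `surface_normals` and the `spacing` parameter are hypothetical.

```python
import numpy as np

def surface_normals(z, spacing=1.0):
    """Unit surface normals for a range image z[row, col].

    Returns an array of shape z.shape + (3,) holding (nx, ny, nz),
    with the normals pointing towards increasing z.
    """
    # np.gradient returns the axis-0 (row, y) derivative first.
    zy, zx = np.gradient(z, spacing)
    # Unnormalised normal of the height field (x, y, z(x, y)).
    n = np.stack([-zx, -zy, np.ones(z.shape)], axis=-1)
    # Scale every normal to unit length.
    return n / np.linalg.norm(n, axis=-1, keepdims=True)
```

The normal at a keypoint would then be read as `surface_normals(z)[row, col]`, and its angle to the viewing axis (0, 0, 1) gives a simple estimate of the local surface tilt.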