# Statistical learning theory, classification, and dimensionality reduction

## Description

Linear discriminant analysis (LDA) suffers from the small sample size (SSS) problem. Researchers have proposed several modified versions of LDA to deal with this problem, but a solid theoretical analysis has been missing. We analyze LDA and the SSS problem from the standpoint of learning theory. Although originally derived from Fisher's criterion, LDA can also be formulated as a least-squares (LS) approximation problem. In this formulation it becomes clear that LDA is an ill-posed problem and thus inherently unstable. To transform the ill-posed problem into a well-posed one, a regularization term is necessary. On this basis we establish a new approach to discriminant analysis, which we call discriminant learning analysis (DLA). DLA is well-posed and behaves well in the SSS situation.

Parzen Windows, as a nonparametric method, has been applied to a variety of density estimation and classification tasks. While it converges to the unknown probability density in the asymptotic limit, theoretical analysis of its performance with finite samples has been lacking. We establish a finite-sample error bound for Parzen Windows and discuss its properties. This analysis provides interesting insight into Parzen Windows, as well as the nearest neighbor method, from the point of view of learning theory.

Texture is an important property of surfaces that enables us to distinguish objects. There are several approaches to computing texture features. Of particular interest is multi-channel filtering because of its simplicity. However, the main difficulty associated with such an approach is the resolution of decomposition. Most proposed techniques are optimized with respect to image representation, giving no direct guarantee of good feature separation. This dissertation proposes a systematic method for learning optimal filters for texture classification.
Since the filter training in the proposed technique is naturally tied to classifier training, the resulting filters are optimized with respect to classification.

This dissertation also investigates the use of subspace analysis methods for learning low-dimensional representations for classification. We propose a kernel-pooled local discriminant subspace method and compare it against competing techniques, kernel principal component analysis (KPCA) and generalized discriminant analysis (GDA), on classification problems. We evaluate the classification performance of the nearest-neighbor rule with each subspace representation.
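The ill-posedness of the least-squares view of LDA in the SSS setting, and the regularization idea behind DLA, can be illustrated with a minimal NumPy sketch. The data, dimensions, and ridge parameter below are hypothetical, and a plain ridge (Tikhonov) term stands in for the dissertation's actual regularizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small-sample-size setting: fewer samples (n = 10) than dimensions (d = 50).
n, d = 10, 50
X = rng.normal(size=(n, d))
y = np.where(np.arange(n) < n // 2, 1.0, -1.0)  # +/-1 class targets

# Unregularized least squares: X^T X has rank at most n < d, so the
# normal equations are singular -- the problem is ill-posed.
gram = X.T @ X
print("rank of X^T X:", np.linalg.matrix_rank(gram), "of", d)

# Adding a ridge (Tikhonov) term lam * I makes the system well-posed
# and the solution stable.
lam = 0.1
w = np.linalg.solve(gram + lam * np.eye(d), X.T @ y)
print("training signs:", np.sign(X @ w))
```

With `lam = 0`, the normal equations have no unique solution; any positive `lam` restores a unique, stable minimizer.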
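The Parzen window estimator discussed earlier can be written in a few lines. The Gaussian kernel, bandwidth, and sample size below are illustrative choices, not the settings analyzed in the dissertation:

```python
import numpy as np

def parzen_estimate(x, samples, h):
    """Parzen window density estimate at x, Gaussian kernel of bandwidth h."""
    d = samples.shape[1]
    diffs = (x - samples) / h                      # (n, d) scaled offsets
    kernels = np.exp(-0.5 * np.sum(diffs**2, axis=1))
    kernels /= (2.0 * np.pi) ** (d / 2.0) * h**d   # Gaussian normalization
    return kernels.mean()                          # average over windows

rng = np.random.default_rng(1)
samples = rng.normal(size=(2000, 1))  # draws from a standard 1-D Gaussian

# With enough samples the estimate at 0 should approach the true density
# 1/sqrt(2*pi) ~= 0.3989.
print(parzen_estimate(np.zeros(1), samples, h=0.3))
```

Shrinking `h` reduces bias but raises variance; the finite-sample error bound in the dissertation concerns exactly this trade-off.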
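The evaluation pipeline described above, embedding data in a subspace and classifying with the nearest-neighbor rule, can be sketched with a plain NumPy kernel PCA on a toy two-ring dataset. All parameters (kernel width, subspace dimension, data) are hypothetical:

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def kpca_embed(X, gamma, k):
    """Embed X via the top-k eigenvectors of the centered kernel matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J            # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:k]              # largest k eigenvalues
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy data: two concentric rings, one per class.
rng = np.random.default_rng(2)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
radius = np.where(np.arange(200) < 100, 1.0, 3.0)
X = np.c_[radius * np.cos(t), radius * np.sin(t)]
y = (np.arange(200) >= 100).astype(int)

Z = kpca_embed(X, gamma=0.5, k=2)

# Leave-one-out nearest-neighbor classification in the embedded space.
D = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
np.fill_diagonal(D, np.inf)
pred = y[np.argmin(D, axis=1)]
print("1-NN accuracy in KPCA subspace:", (pred == y).mean())
```

The same leave-one-out loop can be reused with any other embedding, which is how the competing subspace representations can be compared under an identical nearest-neighbor rule.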