# Search space reduction methods for active learning

## Description

We seek progress in active learning by framing it as a search problem in alternative spaces. In active learning, samples are added selectively to the training set according to criteria that measure their value. This measure is difficult to make explicit, but quantifying the effect of a new sample on the search space is a path to doing so. The direct outcome is a quantification of the information content of samples; an important secondary outcome is an independent framework for characterizing classification algorithms.

Two overarching ideas shape our approach. First, the set of learners consistent with the samples labeled thus far can be characterized via isomorphic, i.e., dual, spaces: the feature space, into which the samples are mapped by an algorithm-dependent function, and the version space, the set of all consistent classifiers. Second, each additional consistent sample removes a contiguous convex region from either space.

Three research tasks are proposed: (i) development of a version space volume estimator; (ii) development of similar metrics directly on the feature space (the dimensions induced by a search-algorithm-dependent mapping); and (iii) investigation of class-dependent transformations of the feature space within which the developed metrics are more effective.

Estimating the version space volume is an intensely studied research problem. Similar ends, i.e., measuring the effect of a new sample on the version space, have been approached by estimating the centroid of the space. Support vector machines have been used for centroid estimation, as has a task-specific method known as the analytic center machine. In contrast to centroid approaches, and working both within and between duals of the version space, we explore several methods to estimate the volume of the space bounding all consistent solutions.
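As a toy illustration of the second idea (our own hypothetical sketch, not a method from this proposal): for linear classifiers through the origin, the version space is a region of the unit sphere, and each consistently labeled sample intersects it with a halfspace, cutting away a convex cap. The relative volume of the remaining region can be estimated by Monte Carlo sampling (the function name and setup are illustrative assumptions):

```python
import numpy as np

def version_space_fraction(X, y, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the (relative) version space volume for
    linear classifiers through the origin: the fraction of directions w
    on the unit sphere that classify every labeled sample correctly."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_samples, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # points on the unit sphere
    # w is consistent iff sign(w . x_i) == y_i for every labeled (x_i, y_i)
    consistent = np.all(np.sign(W @ X.T) == y, axis=1)
    return consistent.mean()

# Each consistent label removes a convex (spherical-cap) region:
X = np.array([[1.0, 0.0]])
y = np.array([1.0])
f1 = version_space_fraction(X, y)    # one halfspace constraint, about 1/2

X2 = np.vstack([X, [[0.0, 1.0]]])
y2 = np.array([1.0, 1.0])
f2 = version_space_fraction(X2, y2)  # two orthogonal constraints, about 1/4
```

The drop from `f1` to `f2` is exactly the quantity of interest: how much of the space of consistent solutions a new labeled sample removes.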
For instance, while the version space volume is difficult to compute directly, we have found that an optimal bounding ellipsoid provides a reasonable estimate, if not of the volume itself then of the impact that each sample has on it. For the feature space, we propose to use an existing formulation, support vector data description (SVDD). This choice is inspired in part by the observation that the Bayes optimal classifier for a one-class classifier (OCC) has an ellipsoidal footprint, and in part by the fact that the sets of feature values defining consistent SVMs are ellipsoids.
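One concrete way such a bounding ellipsoid could be computed (a sketch under our own assumptions, not necessarily the proposal's method) is Khachiyan's algorithm for the minimum-volume enclosing ellipsoid of a point set; the ellipsoid's volume then follows from the determinant of its shape matrix:

```python
from math import pi, gamma

import numpy as np

def mvee(P, tol=1e-4):
    """Khachiyan's algorithm for the minimum-volume enclosing ellipsoid
    (x - c)^T A (x - c) <= 1 of the n x d point set P."""
    n, d = P.shape
    Q = np.column_stack([P, np.ones(n)]).T  # (d+1) x n lifted points
    u = np.full(n, 1.0 / n)                 # weights on the points
    err = tol + 1.0
    while err > tol:
        V = Q @ np.diag(u) @ Q.T
        # Mahalanobis distance of each lifted point under current weights
        M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(V), Q)
        j = np.argmax(M)
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        err = np.linalg.norm(new_u - u)
        u = new_u
    c = P.T @ u                             # ellipsoid center
    A = np.linalg.inv(P.T @ np.diag(u) @ P - np.outer(c, c)) / d
    return A, c

def ellipsoid_volume(A):
    """Volume of {x : x^T A x <= 1}: unit-ball volume / sqrt(det A)."""
    d = A.shape[0]
    return (pi ** (d / 2) / gamma(d / 2 + 1)) / np.sqrt(np.linalg.det(A))

# Usage: the MVEE of the corners of the square [-1, 1]^2 is the circle
# of radius sqrt(2), with area 2*pi.
P = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
A, c = mvee(P)
vol = ellipsoid_volume(A)
```

Tracking how `vol` shrinks as constraints (here, boundary points) are added mirrors the per-sample impact estimate described above; the minimum enclosing ball, i.e., SVDD with a linear kernel, is the spherical special case of the same construction.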