# Search space reduction methods for active learning

## Description

We seek progress in active learning by framing it as a search problem in alternative spaces. In active learning, samples are added selectively to the training set according to criteria that measure their value. This measure is difficult to make explicit, but quantifying the effect of a new sample on the search space is a path to doing so. The direct outcome is a quantification of the information content of samples; an important secondary outcome is an independent framework for characterizing classification algorithms.

Two overarching ideas shape our approach. First, the set of learners consistent with the samples labeled thus far can be characterized via isomorphic, i.e., dual, spaces: the feature space, into which the samples are mapped by an algorithm-dependent function, and the version space, the set of all consistent classifiers. Second, each additional consistent sample removes a contiguous convex region from either space.

Three research tasks are proposed: (i) development of a version space volume estimator; (ii) development of similar metrics directly on the feature space (the dimensions induced by a search-algorithm-dependent mapping); and (iii) investigation of class-dependent transformations of the feature space within which the developed metrics are more effective.

Estimating the version space volume is an intensely studied research problem. Similar ends, i.e., measuring the effect of a new sample on the version space, have been approached by estimating the centroid of the space. Support vector machines have been used for centroid estimation, as has a task-specific method known as the analytic center machine. In contrast to centroid approaches, and working both within and between duals of the version space, we explore several methods to estimate the volume of the space bounding all consistent solutions.
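As a toy illustration of the second idea (our own hypothetical sketch, not a method from this proposal): for linear classifiers through the origin, the version space is a region of the unit sphere, and each consistently labeled sample intersects it with a halfspace, cutting away a convex cap. The relative volume of the remaining region can be estimated by Monte Carlo sampling (the function name and setup are illustrative assumptions):

```python
import numpy as np

def version_space_fraction(X, y, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the (relative) version space volume for
    linear classifiers through the origin: the fraction of directions w
    on the unit sphere that classify every labeled sample correctly."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_samples, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # points on the unit sphere
    # w is consistent iff sign(w . x_i) == y_i for every labeled (x_i, y_i)
    consistent = np.all(np.sign(W @ X.T) == y, axis=1)
    return consistent.mean()

# Each consistent label removes a convex (spherical-cap) region:
X = np.array([[1.0, 0.0]])
y = np.array([1.0])
f1 = version_space_fraction(X, y)    # one halfspace constraint, about 1/2

X2 = np.vstack([X, [[0.0, 1.0]]])
y2 = np.array([1.0, 1.0])
f2 = version_space_fraction(X2, y2)  # two orthogonal constraints, about 1/4
```

The drop from `f1` to `f2` is exactly the quantity of interest: how much of the space of consistent solutions a new labeled sample removes.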
For instance, while the version space volume is difficult to compute directly, we have found that an optimal bounding ellipsoid provides a reasonable estimate, if not of the volume itself then of the impact that each sample has on it. For the feature space, we propose to use an existing formulation, support vector data description (SVDD). This choice is inspired in part by the observation that the Bayes optimal classifier for a one-class classifier (OCC) has an ellipsoidal footprint, and in part by the fact that the sets of feature values defining consistent SVMs are ellipsoids.
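One concrete way such a bounding ellipsoid could be computed (a sketch under our own assumptions, not necessarily the proposal's method) is Khachiyan's algorithm for the minimum-volume enclosing ellipsoid of a point set; the ellipsoid's volume then follows from the determinant of its shape matrix:

```python
from math import pi, gamma

import numpy as np

def mvee(P, tol=1e-4):
    """Khachiyan's algorithm for the minimum-volume enclosing ellipsoid
    (x - c)^T A (x - c) <= 1 of the n x d point set P."""
    n, d = P.shape
    Q = np.column_stack([P, np.ones(n)]).T  # (d+1) x n lifted points
    u = np.full(n, 1.0 / n)                 # weights on the points
    err = tol + 1.0
    while err > tol:
        V = Q @ np.diag(u) @ Q.T
        # Mahalanobis distance of each lifted point under current weights
        M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(V), Q)
        j = np.argmax(M)
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        err = np.linalg.norm(new_u - u)
        u = new_u
    c = P.T @ u                             # ellipsoid center
    A = np.linalg.inv(P.T @ np.diag(u) @ P - np.outer(c, c)) / d
    return A, c

def ellipsoid_volume(A):
    """Volume of {x : x^T A x <= 1}: unit-ball volume / sqrt(det A)."""
    d = A.shape[0]
    return (pi ** (d / 2) / gamma(d / 2 + 1)) / np.sqrt(np.linalg.det(A))

# Usage: the MVEE of the corners of the square [-1, 1]^2 is the circle
# of radius sqrt(2), with area 2*pi.
P = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
A, c = mvee(P)
vol = ellipsoid_volume(A)
```

Tracking how `vol` shrinks as constraints (here, boundary points) are added mirrors the per-sample impact estimate described above; the minimum enclosing ball, i.e., SVDD with a linear kernel, is the spherical special case of the same construction.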