Cold-Start Active Sampling via γ-Tube
ABSTRACT :
Active learning (AL) improves the generalization performance of the current classification hypothesis by querying labels from a pool of unlabeled data. The sampling process is typically assessed by an informative, representative, or diverse evaluation policy. However, such a policy needs an initial labeled set to start and may therefore degenerate under a cold-start hypothesis. In this article, we first show that typical AL sampling can be equivalently formulated as geometric sampling over minimum enclosing balls (MEBs) of clusters. Following the γ-tube structure in geometric clustering, we then divide one MEB covering a cluster into two parts: 1) a γ-tube and 2) a γ-ball. By estimating the error disagreement between sampling in the MEB and in the γ-ball, our theoretical insight reveals that the γ-tube effectively measures the disagreement of hypotheses between the original space over the MEB and the sampling space over the γ-ball. To tighten this insight, we present a generalization analysis, and the results show that sampling in the γ-tube yields a higher-probability bound on achieving a nearly zero generalization error. With these analyses, we finally apply the informative sampling policy of AL over the γ-tube to present a tube AL (TAL) algorithm that addresses the cold-start sampling issue. As a result, the dependency between the querying process and the evaluation policy of active sampling is alleviated. Experimental results show that, by using the γ-tube structure to deal with cold-start sampling, TAL achieves superior performance over standard AL evaluation baselines with substantial accuracy improvements. An image edge recognition task further supports our theoretical results.
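To make the γ-tube idea concrete, the following is a minimal, illustrative sketch (not the authors' TAL algorithm): it approximates each cluster's MEB by its centroid and maximum radius, splits the ball into an inner γ-ball and an outer γ-tube, and queries cold-start labels from the tube. The function name gamma_tube_query, the centroid-based MEB approximation, and the choices of KMeans and γ = 0.2 are assumptions made only for illustration.

# Minimal sketch (illustrative only): approximate each cluster's minimum
# enclosing ball (MEB) by its centroid and maximum member distance, then split
# the MEB into an inner gamma-ball and an outer gamma-tube. Points in the tube
# are queried first to seed a cold-start active learner.
import numpy as np
from sklearn.cluster import KMeans

def gamma_tube_query(X, n_clusters=5, gamma=0.2, budget=20, seed=0):
    """Return indices of unlabeled points lying in the gamma-tube of each approximate MEB."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    tube_idx = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        center = km.cluster_centers_[c]
        dist = np.linalg.norm(X[members] - center, axis=1)
        R = dist.max()                      # radius of the approximate MEB
        inner = (1.0 - gamma) * R           # radius of the gamma-ball
        # gamma-tube = shell between the gamma-ball surface and the MEB surface
        in_tube = members[dist >= inner]
        tube_idx.extend(in_tube.tolist())
    rng = np.random.default_rng(seed)
    rng.shuffle(tube_idx)
    return tube_idx[:budget]                # initial cold-start queries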
EXISTING SYSTEM :
• In recommendation systems, for example, ratings data for existing users can inform a strategy that efficiently elicits preferences for new users who lack prior rating data, thus bootstrapping the system quickly out of the cold-start setting.
• We evaluate the model on "active" variants of existing one-shot learning tasks for Omniglot, and show that it can learn efficient label-querying strategies.
• Active learning can be useful when the cost incurred for labeling an item may be traded for lower prediction error, and where the model must be data-efficient.
• We expedite the training process by allowing our model to observe and mimic a strong selection policy with oracle knowledge of the labels.
DISADVANTAGE :
• These issues make distance comparisons between gradient embeddings less meaningful and raise the cost of computing those distances.
• Once the cold-start issue subsides, uncertainty-based methods can be employed to further query the examples that are most confusing for the model.
• Thus, warm-start methods may suffer from problems with model uncertainty or inference.
• Both methods strive to optimize for uncertainty and diversity, which alleviates problems with class imbalance.
• While random sampling chooses many sentences of this form, ALPS seems to avoid this problem.
PROPOSED SYSTEM :
• We demonstrate empirically that our proposed model learns effective active learning algorithms in an end-to-end fashion.
• Various heuristics have been proposed to guide the selection of which examples to label during active learning.
• Our proposed model instead falls into the class of pool-based active learners, i.e., it has access to a static collection of unlabeled data and selects both the items for which to observe labels and the order in which to observe them (a generic loop of this kind is sketched after this list).
• We propose moving away from engineered selection heuristics towards learning active learning algorithms end-to-end via meta-learning.
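Below is a minimal sketch of the pool-based setting described above, assuming a plain least-confidence heuristic stands in for the learned, end-to-end selection policy; the function name pool_based_al, the scikit-learn classifier, and the batch/round sizes are illustrative assumptions only.

# Minimal pool-based active-learning loop sketch. The selection rule here is a
# simple least-confidence heuristic used only as a placeholder for a learned policy.
import numpy as np
from sklearn.linear_model import LogisticRegression

def pool_based_al(X_pool, y_oracle, init_idx, rounds=10, batch=5):
    # init_idx is assumed to contain examples from at least two classes.
    labeled = list(init_idx)
    clf = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf.fit(X_pool[labeled], y_oracle[labeled])
        proba = clf.predict_proba(X_pool)
        confidence = proba.max(axis=1)
        confidence[labeled] = np.inf             # never re-query labeled items
        query = np.argsort(confidence)[:batch]   # least-confident examples
        labeled.extend(query.tolist())           # query the oracle for their labels
    return clf, labeled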
ADVANTAGE :
• The pre-training loss can find examples that surprise the model and should be labeled for efficient fine-tuning.
• While uncertainty sampling efficiently searches the hypothesis space by finding difficult examples to label, diversity sampling exploits heterogeneity in the feature space.
• BADGE combines uncertainty and diversity sampling to profit from the advantages of both methods, but it also inherits the downsides of both: reliance on warm-starting and computational inefficiency (a simplified combination of the two criteria is sketched after this list).
• The algorithm is the same as ALPS except that BERT embeddings are used.
• ALPS outperforms AL baselines in accuracy and algorithmic efficiency. The success of ALPS highlights the importance of self-supervision for cold-start AL.
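The following is a simplified, BADGE-style sketch of combining uncertainty and diversity, assuming penultimate-layer features stand in for true gradient embeddings and k-means++ seeding supplies the diverse batch; the function name badge_style_query and these substitutions are assumptions for illustration, not the ALPS or BADGE implementations.

# Simplified uncertainty-plus-diversity selection sketch: scale each unlabeled
# example's feature embedding by its predictive uncertainty, then pick a
# diverse batch with k-means++ seeding over the scaled embeddings.
import numpy as np
from sklearn.cluster import kmeans_plusplus

def badge_style_query(embeddings, probs, batch=10, seed=0):
    uncertainty = 1.0 - probs.max(axis=1)        # low confidence -> large scale
    scaled = embeddings * uncertainty[:, None]   # uncertainty-weighted embeddings
    _, idx = kmeans_plusplus(scaled, n_clusters=batch, random_state=seed)
    return idx                                   # indices of points to label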