Progressive Self-Supervised Clustering With Novel Category Discovery
ABSTRACT :
Clustering is one of the classical themes in machine learning and pattern recognition for analyzing data structure. Recently, anchor-based graphs have been widely adopted to improve the accuracy of many graph-based clustering techniques. To achieve better clustering performance, we propose a novel clustering approach, the progressive self-supervised clustering method with novel category discovery (PSSCNCD), which consists of three procedures. First, we propose a new semisupervised framework with novel category discovery to guide label propagation, reinforced by a parameter-insensitive anchor-based graph obtained from balanced K-means and hierarchical K-means (BKHK). Second, we design a novel representative point selection strategy based on our semisupervised framework to discover each representative point and assign it a pseudolabel progressively, where every pseudolabel hypothetically corresponds to a real category in each self-supervised label propagation. Third, when sufficient representative points have been found, the labels of all samples are predicted to obtain the final clustering results. Experimental results on several toy examples and benchmark data sets demonstrate that our method outperforms other clustering approaches.
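The first procedure (anchors from BKHK, an anchor-based graph, then label propagation) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the farthest-point initialization, the k-nearest-anchor weighting and the propagation parameter alpha are all assumptions made for illustration, and the novel category discovery and representative point selection steps are not reproduced here.

```python
import numpy as np

def balanced_two_means(X, n_iter=10):
    """Split the rows of X into two equal-size halves (balanced 2-means sketch)."""
    # Assumed initialization: two mutually distant points (deterministic).
    c0 = X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]
    c1 = X[np.argmax(((X - c0) ** 2).sum(axis=1))]
    centers = np.stack([c0, c1])
    half = len(X) // 2
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # The half with the smallest (d0 - d1) goes to cluster 0, the rest to cluster 1.
        order = np.argsort(d[:, 0] - d[:, 1])
        left, right = order[:half], order[half:]
        centers = np.stack([X[left].mean(axis=0), X[right].mean(axis=0)])
    return left, right

def bkhk_anchors(X, depth):
    """Return 2**depth anchors as the means of a balanced binary hierarchy."""
    groups = [np.arange(len(X))]
    for _ in range(depth):
        next_groups = []
        for idx in groups:
            left, right = balanced_two_means(X[idx])
            next_groups += [idx[left], idx[right]]
        groups = next_groups
    return np.stack([X[idx].mean(axis=0) for idx in groups])

def anchor_graph(X, anchors, k=3):
    """Connect each sample to its k nearest anchors; every row of Z sums to 1."""
    d = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d, axis=1)
    d_sorted = np.take_along_axis(d, order, axis=1)
    # Assumed parameter-free weighting based on the (k+1)-th nearest anchor.
    num = d_sorted[:, k:k + 1] - d_sorted[:, :k]
    den = num.sum(axis=1, keepdims=True)
    w = np.where(den > 1e-12, num / np.maximum(den, 1e-12), 1.0 / k)
    Z = np.zeros(d.shape)
    Z[np.arange(len(X))[:, None], order[:, :k]] = w
    return Z

def propagate_labels(Z, Y, alpha=0.9, n_iter=50):
    """Iterate F = alpha * W F + (1 - alpha) * Y with W = Z diag(colsum)^-1 Z^T."""
    col = np.maximum(Z.sum(axis=0), 1e-12)
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * (Z @ ((Z.T @ F) / col[:, None])) + (1 - alpha) * Y
    return F.argmax(axis=1)

# Toy data: two well-separated blobs, one labeled sample per blob.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(10.0, 0.5, (40, 2))])
Y = np.zeros((80, 2))
Y[0, 0] = 1.0   # sample 0 carries label 0
Y[40, 1] = 1.0  # sample 40 carries label 1

anchors = bkhk_anchors(X, depth=3)  # 8 anchors
Z = anchor_graph(X, anchors, k=3)
pred = propagate_labels(Z, Y)
```

Because the similarity matrix is never formed explicitly (only products with Z are used), the cost of each propagation step grows with the number of anchors rather than with the square of the number of samples, which is the usual motivation for anchor-based graphs.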
EXISTING SYSTEM :
• Most existing works assume a fully supervised setting, where instance-level segmentation masks are available during training.
• Most existing segmentation datasets contain only a small set of annotated categories, so models trained on these categories do not generalize well to the novel and long-tail objects present in the real world.
• Most existing metric learning methods in computer vision use Euclidean or spherical distances.
• Compared with previous partially or weakly supervised methods, which are trained only to detect and segment categories that exist in the ground-truth annotations, our method considers a larger set of categories by keeping all the class-agnostic mask proposals.
DISADVANTAGE :
• The second problem is that the over-reliance on source supervision makes it challenging to obtain discriminative features on the target.
• We propose to overcome these challenging problems by introducing Domain Adaptive Neighborhood Clustering via Entropy optimization (DANCE).
• The underlying issue at play is that existing work heavily relies on prior knowledge about the category shift.
• The positive impact of our work is to reduce the data gathering effort for data-expensive applications.
• The negative impacts could be to make these systems more accessible to companies, governments or individuals that attempt to use them for criminal activities such as fraud.
PROPOSED SYSTEM :
• We present an effective approach that exploits the inherent hierarchical visual structure of the objects and enables the embedded features of all proposed masks to be easily differentiated through unsupervised clustering.
• We use sampling mechanisms to sample the masks such that the hierarchical structure in the proposed masks is fully exploited and encoded in the hyperbolic space.
• In this paper, we propose an instance segmentation method that discovers long-tail objects through self-supervised representation learning.
• Leveraging rich relationship and hierarchical structure between objects in the images, we propose self-supervised losses for learning mask embeddings.
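To illustrate why a hyperbolic space suits hierarchical structure, the sketch below computes distances in the Poincaré ball, one common model of hyperbolic space. The text above does not say which model is used, so the choice of the Poincaré ball and the function name poincare_distance are assumptions for illustration only.

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit (Poincare) ball."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = ((u - v) ** 2).sum()
    denom = (1.0 - (u ** 2).sum()) * (1.0 - (v ** 2).sum())
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# Two pairs with the same Euclidean gap (0.1): one near the origin, one near the boundary.
d_center = poincare_distance([0.0, 0.0], [0.0, 0.1])
d_boundary = poincare_distance([0.9, 0.0], [0.9, 0.1])
print(d_center, d_boundary)
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary of the ball, so general (parent) embeddings can sit near the origin while many specific (child) embeddings spread out near the boundary and remain well separated.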
ADVANTAGE :
• The main challenge in domain adaptation (DA) is to leverage unlabeled target data to improve the source classifier’s performance while accounting for domain shift.
• The batch normalization layer whitens the feature activations, which contributes to a performance gain.
• This kind of weak alignment matches our goal because strongly aligning feature distributions can harm the performance on non-closed set domain adaptation.
• Although we show improved performance relative to the state of the art, negative transfer could still occur; therefore, our approach should not be used in mission-critical applications or to make important decisions without human oversight.