Video frame set splitting on reaccuracy attack
ABSTRACT :
This paper introduces a video copy detection system which efficiently matches individual frames and then verifies their spatio-temporal consistency. The approach for matching frames relies on a recent local feature indexing method, which is both robust to significant video transformations and efficient in memory usage and computation time. We match either keyframes or uniformly sampled frames. To further improve the results, a verification step robustly estimates a spatio-temporal model between the query video and the potentially corresponding video segments. Experimental results evaluate the different parameters of our system and measure the trade-off between accuracy and efficiency. We show that our system obtains excellent results for the TRECVID 2008 copy detection task.
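The temporal part of the verification step can be illustrated with a small voting sketch. It assumes the frame-matching stage has already produced pairs of (query time, database time); the function name, the bin width, and the sample matches are hypothetical, and the sketch only covers a constant time shift, not the full spatio-temporal model.

```python
from collections import Counter


def best_temporal_offset(matches, bin_width=1.0):
    """Vote for the time shift (database time minus query time) that most matches agree on.

    matches: iterable of (query_time, database_time) pairs in seconds.
    Returns (offset_in_seconds, number_of_votes), or (None, 0) if there are no matches.
    """
    votes = Counter()
    for t_query, t_db in matches:
        # Quantize the shift so that nearly identical offsets fall into the same bin.
        votes[round((t_db - t_query) / bin_width)] += 1
    if not votes:
        return None, 0
    bin_index, count = votes.most_common(1)[0]
    return bin_index * bin_width, count


# Hypothetical frame matches: four agree on a shift of about 12 s, one is an outlier.
matches = [(0.0, 12.0), (1.0, 13.0), (2.0, 14.1), (3.0, 40.0), (4.0, 16.0)]
offset, support = best_temporal_offset(matches)
```

Matches that disagree with the winning offset can then be discarded before any finer spatial check is applied.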
EXISTING SYSTEM :
• There are currently no video classification benchmarks that match the scale and variety of existing image datasets because videos are significantly more difficult to collect, annotate and store.
• We present preliminary results with a new Creation Attack, wherein innocuous physical stickers fool a model into detecting nonexistent objects.
• We further introduce a new Creation Attack, wherein physical stickers that humans would ignore as inconspicuous can fool an object detector into recognizing nonexistent Stop signs.
• We propose and experiment with a new type of Creation Attack that aims at fooling a detector into recognizing adversarial stickers as nonexistent objects.
DISADVANTAGE :
• In this paper we present a system which addresses the problem of searching for strongly deformed videos in relatively small datasets.
• In this case, the indexing structure must be stored in part on disk. Others address the problem of detecting repeated subsequences, such as advertising clips, from a video stream. In this case the video quality is usually high and the deformations consistent across sequences.
• To address this problem, we have used the previously proposed Hamming Embedding method.
• Slow motion and fast-forward (which correspond to a 1D affine transformation of time) are problematic, as is re-editing a video using cuts (the transformation is then non-continuous).
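The Hamming Embedding idea mentioned above can be sketched as follows: two local descriptors match only if they fall in the same visual word and their binary signatures are close in Hamming distance. This is a minimal illustration, not the system's implementation; the 8-bit signatures, the threshold of 2, and all names are assumptions chosen for readability.

```python
def hamming_distance(sig_a: int, sig_b: int) -> int:
    """Number of differing bits between two binary signatures stored as ints."""
    return bin(sig_a ^ sig_b).count("1")


def filter_matches(query_desc, candidates, max_distance=2):
    """Keep candidates that share the query's visual word and have a close signature.

    query_desc: (visual_word, signature)
    candidates: list of (frame_id, visual_word, signature)
    """
    q_word, q_sig = query_desc
    return [
        frame_id
        for frame_id, word, sig in candidates
        if word == q_word and hamming_distance(q_sig, sig) <= max_distance
    ]


# Hypothetical query descriptor and database entries.
query = (7, 0b10110010)
candidates = [
    ("frame_a", 7, 0b10110011),  # same word, 1 bit differs: kept
    ("frame_b", 7, 0b01001101),  # same word, all 8 bits differ: rejected
    ("frame_c", 3, 0b10110010),  # different visual word: rejected
]
kept = filter_matches(query, candidates)
```

The signature test removes many false matches that a visual word comparison alone would accept, at the cost of a few extra bits per indexed descriptor.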
PROPOSED SYSTEM :
• The proposed multiresolution architecture aims to strike a compromise by having two separate streams of processing over two spatial resolutions.
• We propose an effective motion excited sampler to obtain motion-aware noise prior, which we term as sparked prior.
• To investigate the vulnerability and robustness of DNNs, many effective attack methods have been proposed on image models.
• We propose a motion-excited sampler to obtain sparked prior, which leads to more effective gradient estimation for faster adversarial optimization.
• We thus propose the motion-excited sampler to generate a better prior for gradient estimation in a black-box attack setting.
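To make the role of a prior in black-box gradient estimation concrete, here is a generic zeroth-order estimator that only queries a loss function. It is a sketch under stated assumptions: the `prior` argument merely stands in for the sparked prior, the motion-excited sampler itself is not reproduced, and the function name, sample count, and step size are hypothetical.

```python
import random


def estimate_gradient(loss, x, prior, num_samples=20, sigma=0.01, seed=0):
    """Estimate the gradient of `loss` at `x` from loss queries only.

    Each sample perturbs x along a direction equal to `prior` plus Gaussian
    noise and uses a symmetric finite difference to measure the slope along it.
    """
    rng = random.Random(seed)
    grad = [0.0] * len(x)
    for _ in range(num_samples):
        # Search direction: the prior, jittered with unit Gaussian noise.
        u = [p + rng.gauss(0.0, 1.0) for p in prior]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        slope = (loss(x_plus) - loss(x_minus)) / (2.0 * sigma)
        for i, ui in enumerate(u):
            grad[i] += slope * ui / num_samples
    return grad


# Toy linear loss whose true gradient is `weights` (illustrative values only).
weights = [1.0, -2.0, 0.5]


def toy_loss(x):
    return sum(w * xi for w, xi in zip(weights, x))


estimate = estimate_gradient(toy_loss, [0.0, 0.0, 0.0], prior=[0.5, -1.0, 0.0])
alignment = sum(g * w for g, w in zip(estimate, weights))
```

A positive `alignment` means the estimate points in an ascent direction of the loss. A prior that correlates with the true gradient lets fewer queries reach a useful estimate, which is the motivation given for the sampler above.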
ADVANTAGE :
• The excellent performance of our approach: our run STRICT obtained the best results for all the transformations in terms of the NDCR measure.
• The precision-recall curves are a standard way of measuring the performance of an information retrieval system. We have generated these curves for the most difficult transformations.
• The WGC check is integrated in the inverted file and efficiently exploited for all indexed frames, even for a very large dataset: in this paper, we have indexed up to 2 million frames, represented by 800 million local descriptors.
• Processing and indexing all frames from the query and/or database videos would be too costly and also inefficient due to the temporal redundancy.
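The precision-recall curves mentioned above can be computed from a ranked list of detections. The sketch below assumes each returned result has already been labeled correct or incorrect against ground truth; the function name and the sample ranking are hypothetical and do not come from the reported experiments.

```python
def precision_recall_points(ranked_relevance, total_relevant):
    """Return (recall, precision) after each position of a ranked result list.

    ranked_relevance: list of booleans ordered from highest to lowest score,
                      True where the returned item is a correct detection.
    total_relevant:   number of correct items present in the ground truth.
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))
    return points


# Hypothetical ranking: 3 of the 4 true copies are retrieved within 5 results.
points = precision_recall_points([True, True, False, True, False], total_relevant=4)
```

Plotting precision against recall for these points gives the curve; lowering the detection threshold moves along it toward higher recall and usually lower precision.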