- Dartmouth College
Thursday, July 23, 2015
16:45 refreshments; 17:00 seminar begins
The reliance on plentiful and detailed manual annotations for training is a critical limitation of the current state of the art in object localization and detection. This talk presents self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training. The key idea is to analyze the change in the recognition scores when artificially masking out different regions of the image. Masking out a region that contains an object typically causes a significant drop in the recognition score. This idea is embedded into an agglomerative clustering technique that generates self-taught localization hypotheses. Our experiments on a challenging dataset of 200 classes indicate that our automatically generated annotations are accurate enough to train object detectors that yield recognition results remarkably close to those obtained by training on manually annotated bounding boxes.
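The masking idea in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: a fixed grid of candidate boxes stands in for the agglomerative clustering step, `toy_score` is a hypothetical placeholder for a whole-image recognition network, and the fill value used for masking is an assumption. All function names are illustrative.

```python
import numpy as np


def score_drop(image, score_fn, box, fill_value=0.0):
    """Return score(image) minus score(image with `box` masked out).

    `box` is (x0, y0, x1, y1) with exclusive upper bounds.
    `fill_value` is an assumed choice for the masked pixels.
    """
    x0, y0, x1, y1 = box
    masked = image.copy()
    masked[y0:y1, x0:x1] = fill_value
    return score_fn(image) - score_fn(masked)


def grid_boxes(height, width, size, stride):
    """Enumerate square candidate boxes on a regular grid (an assumed
    simplification; the talk uses agglomerative clustering instead)."""
    return [
        (x, y, x + size, y + size)
        for y in range(0, height - size + 1, stride)
        for x in range(0, width - size + 1, stride)
    ]


def rank_boxes_by_drop(image, score_fn, boxes, fill_value=0.0):
    """Sort candidate boxes by decreasing score drop."""
    scored = [(box, float(score_drop(image, score_fn, box, fill_value)))
              for box in boxes]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(image):
    """Hypothetical stand-in for a classifier score: total brightness,
    scaled so that the full toy object below scores 1.0."""
    return float(image.sum()) / 16.0


# Toy image: a bright 4x4 "object" on a dark 8x8 background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

ranked = rank_boxes_by_drop(image, toy_score, grid_boxes(8, 8, size=4, stride=2))
best_box, best_drop = ranked[0]
print(best_box, best_drop)
```

In this toy setting the box that covers the whole object produces the largest drop, which is the signal the abstract describes. With a real network, `toy_score` would be replaced by the recognition score of the class of interest.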