O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out

KUIS AI Center and Department of Computer Engineering, Koç University
ACCV 2024 (Oral)
TL;DR O1O shapes the representation space by encouraging queries of similar classes to group together, allowing us to identify odd-one-out queries as unknown objects.

Abstract

Object detection methods trained on a fixed set of known classes struggle to detect objects of unknown classes in the open-world setting. Current fixes add approximate supervision with pseudo-labels for candidate object locations, typically obtained in a class-agnostic manner. While previous approaches rely mainly on the appearance of objects, we find that geometric cues improve unknown recall. Although additional supervision from pseudo-labels helps to detect unknown objects, it also introduces confusion for known classes: we observe a notable decline in the model's performance on known objects in the presence of noisy pseudo-labels. Drawing inspiration from studies on human cognition, we propose to group known classes into superclasses. By identifying similarities between classes within a superclass, we can identify unknown classes through an odd-one-out scoring mechanism. Our experiments on open-world detection benchmarks demonstrate significant improvements in unknown recall, consistently across all tasks. Crucially, we achieve this without compromising known performance, thanks to a better partitioning of the feature space with superclasses.

Method Overview


Building on Deformable DETR (blue), we first add supervision for unknowns with geometric pseudo-labels. Following GOOD, we extract pseudo-labels (dashed) from a Region Proposal Network (RPN) trained on surface normal maps (green), which allows us to localize unknown objects from geometric cues in a class-agnostic manner. Noisy pseudo-labels, however, tend to hurt the model's performance on known classes. To mitigate this, we group queries into superclasses with a superclass head (red). By incorporating the learned superclass prior into the scoring function, we achieve the best balance between known and unknown performance.
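The odd-one-out intuition can be sketched in a few lines. This is a minimal NumPy illustration under our own assumptions, not the paper's implementation: we score L2-normalized query features against fixed per-superclass prototype vectors, whereas the actual superclass head is a learned classifier, and the function names (`superclass_probs`, `odd_one_out_score`) are ours.

```python
import numpy as np

def superclass_probs(query_embeddings, superclass_prototypes):
    """Score each query feature against every superclass prototype.

    Hypothetical sketch: queries and prototypes are L2-normalized, so the
    dot product is a cosine similarity; a softmax turns similarities into
    a distribution over known superclasses.
    """
    logits = query_embeddings @ superclass_prototypes.T
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def odd_one_out_score(probs):
    """A query that fits no known superclass well is the odd one out.

    One simple instantiation: 1 minus the max superclass probability, so
    queries far from every group receive a high unknown score.
    """
    return 1.0 - probs.max(axis=-1)
```

The key design point is that the unknown score is derived from the known-class structure itself, so no separate "unknown" class needs to be trained.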



Quantitative Results


We compare O1O to the state of the art on M-OWOD (top) and S-OWOD (bottom), across four tasks each. In each column, we highlight the first-, second-, and third-best results.



Qualitative Results


We show the top-10 predictions compared to the previous state-of-the-art on M-OWOD and S-OWOD benchmarks. Compared to other methods, O1O prioritizes known objects with higher confidence and assigns lower scores to unknown predictions, demonstrating better calibration due to superclass supervision.



Incremental Learning


We employ an exemplar fine-tuning strategy for incremental learning. Although some classes are unknown in the earlier tasks, O1O can still localize them. As the set of known categories expands, O1O learns to label them correctly without forgetting previously learned classes, demonstrating the effectiveness of our exemplar-replay fine-tuning.
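The data-selection side of exemplar replay can be sketched as follows. This is an illustrative example only: the buffer structure, per-class budget, and uniform sampling are our assumptions, and `build_finetune_set` is a hypothetical helper, not the exact recipe used by O1O.

```python
import random

def build_finetune_set(new_task_images, exemplar_buffer, per_class_budget=50, seed=0):
    """Mix a new task's data with a small balanced replay buffer.

    Hypothetical sketch of exemplar-replay fine-tuning: after each
    incremental task, a few images per previously known class are kept and
    replayed alongside the new task's images, so old classes are not
    forgotten while new ones are learned.
    """
    rng = random.Random(seed)
    replay = []
    for cls, images in exemplar_buffer.items():
        k = min(per_class_budget, len(images))  # cap each class's share
        replay.extend(rng.sample(images, k))
    finetune = list(new_task_images) + replay
    rng.shuffle(finetune)  # interleave old and new examples
    return finetune
```

Keeping the buffer small and class-balanced is the usual trade-off here: enough old examples to anchor earlier classes, few enough that fine-tuning stays dominated by the new task.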



BibTeX


        @misc{yavuz2024o1ogroupingknownclasses,
          title={O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out}, 
          author={Mısra Yavuz and Fatma Güney},
          year={2024},
          eprint={2410.07514},
          archivePrefix={arXiv},
          primaryClass={cs.CV},
          url={https://arxiv.org/abs/2410.07514}, 
        }