pSTarC: Pseudo Source Guided Target Clustering for Fully Test-Time Adaptation

1Indian Institute of Science, Bengaluru, India    2Indian Institute of Science Education and Research, Pune
WACV, 2024

Abstract

Test Time Adaptation (TTA) is a pivotal concept in machine learning, enabling models to perform well in real-world scenarios where test data distribution differs from training. In this work, we propose a novel approach called pseudo Source guided Target Clustering (pSTarC) addressing the relatively unexplored area of TTA under real-world domain shifts. This method draws inspiration from target clustering techniques and exploits the source classifier for generating pseudo-source samples. The test samples are strategically aligned with these pseudo-source samples, facilitating their clustering and thereby enhancing TTA performance. pSTarC operates solely within the fully test-time adaptation protocol, removing the need for actual source data. Experimental validation on VisDA, Office-Home, DomainNet-126, and CIFAR-100C verifies pSTarC's effectiveness, with significant improvements in prediction accuracy and efficient computational requirements. We also demonstrate the universality of pSTarC by showing its effectiveness for the continuous TTA framework.

Problem Setting

In Test-Time Adaptation (TTA), a model trained on source domain \(\mathcal{D}_s\) must adapt to a target domain \(\mathcal{D}_t\) at inference time, without access to source data or ground truth labels. In the more challenging Continual TTA (CTTA) setting, the target distribution shifts continuously over time: \(P_t^{(1)} \neq P_t^{(2)} \neq \ldots\).

Existing TTA methods typically require the source data to compute statistics or rely on batch normalization updates that become unstable at small batch sizes. pSTarC addresses these limitations by operating in the fully test-time protocol: no source data, no labels, and no offline statistics. It adapts purely from the test stream.

Key challenges addressed:

  • Source-free setting: No access to source data or source model internals beyond the final classifier.
  • Real-world domain shifts: Benchmarks include domain shifts across artistic styles, object categories, and image corruptions.
  • Memory efficiency: pSTarC maintains only a compact pseudo-source feature buffer (0.03M), vs. 4.67M for AdaContrast.

Proposed Method: pSTarC

pSTarC has two components: (1) a pseudo-source feature generation step that synthesizes source-like features from the frozen classifier, and (2) a target clustering step that aligns test samples to these pseudo-source features.

Figure: Overview of the pSTarC framework for test-time adaptation.

1. Pseudo Source Feature Generation

We randomly initialize a feature bank \(\mathbf{f}\) and iteratively optimize it, keeping the classifier \(h\) fixed, to minimize entropy while maximizing class diversity:

\[ \mathbf{f}^{*} = \arg\min_{\mathbf{f}}\; \mathcal{L}_{ent}(\mathbf{f}; h) + \beta\, \mathcal{L}_{div}(\mathbf{f}; h) \]

where

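\[ \mathcal{L}_{ent}(\mathbf{f}; h) = -\frac{1}{N}\sum_{i=1}^N\sum_{k=1}^C p_{ik} \log p_{ik} \qquad \mathcal{L}_{div}(\mathbf{f}; h) = \sum_{k=1}^C \hat{p}_k \log \hat{p}_k \]

Here \(p_{ik}\) is the softmax probability that the classifier \(h\) assigns to class \(k\) for the \(i\)-th feature in the bank, \(\hat{p}_k = \frac{1}{N}\sum_{i=1}^N p_{ik}\) is its mean over the \(N\) features, and \(C\) is the number of classes.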

The resulting features act as pseudo-source prototypes: low-entropy, class-discriminative feature representations that stand in for unavailable source data.
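To make the objective concrete, the following minimal sketch scores a small feature bank with \(\mathcal{L}_{ent} + \beta\,\mathcal{L}_{div}\) in plain Python. It assumes a linear classifier \(h(\mathbf{f}) = W\mathbf{f} + b\) followed by a softmax; the helper names (`linear_softmax`, `pseudo_source_objective`), the toy weights, and the `EPS` guard are illustrative choices, not taken from the paper. The sketch only evaluates the objective for a given bank; the iterative optimization of \(\mathbf{f}\) would normally be delegated to an autodiff framework.

```python
import math

EPS = 1e-12  # guards log(0); an implementation detail of this sketch


def linear_softmax(feature, weights, bias):
    """Softmax output of an assumed linear classifier h(f) = W f + b."""
    logits = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_loss(probs):
    """L_ent: mean Shannon entropy of the per-feature predictions."""
    return -sum(p * math.log(p + EPS) for row in probs for p in row) / len(probs)


def diversity_loss(probs):
    """L_div: negative entropy of the mean prediction over the bank."""
    n = len(probs)
    mean = [sum(row[k] for row in probs) / n for k in range(len(probs[0]))]
    return sum(m * math.log(m + EPS) for m in mean)


def pseudo_source_objective(features, weights, bias, beta=1.0):
    """L_ent + beta * L_div for a feature bank under a frozen classifier."""
    probs = [linear_softmax(f, weights, bias) for f in features]
    return entropy_loss(probs) + beta * diversity_loss(probs)


# Toy two-class classifier (hypothetical weights).
W = [[5.0, 0.0], [0.0, 5.0]]
b = [0.0, 0.0]

diverse = [[1.0, 0.0], [0.0, 1.0]]    # confident, one feature per class
collapsed = [[1.0, 0.0], [1.0, 0.0]]  # confident, but a single class
uncertain = [[0.0, 0.0], [0.0, 0.0]]  # uniform predictions

scores = {name: pseudo_source_objective(bank, W, b)
          for name, bank in [("diverse", diverse),
                             ("collapsed", collapsed),
                             ("uncertain", uncertain)]}
print(scores)
```

In this toy setting the confident, class-balanced bank receives the lowest score. The collapsed and uniform banks both score close to zero, because the entropy and diversity terms cancel for them, which is why both terms are needed to favour class-discriminative, diverse prototypes.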

2. Pseudo Source Guided Target Clustering

At test time, each test sample \(x_k\) with prediction \(p_k\) is adapted via three terms: (i) \(L_{aug}\): consistency between the prediction and its strong augmentation \(\tilde{p}_k\); (ii) \(L_{attr}\): attraction of low-entropy samples toward the nearest pseudo-source features \(\mathbf{p}^+\); (iii) \(L_{disp}\): dispersion of predictions across the batch to prevent collapse.

\[ \mathcal{L}_{\textrm{pSTarC}}(x_k) = \underbrace{-p_k^T\tilde{p}_k}_{L_{aug}} - \underbrace{\sum_{p_j^+\in\mathbf{p}^+} p_k^T p_j^+}_{L_{attr}} + \underbrace{\lambda\sum_{x_j\in\mathbf{x}_t} p_k^T p_j}_{L_{disp}} \]
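The per-sample loss can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the authors' implementation: predictions are plain lists of class probabilities, the positives are chosen as the `m` pseudo-source predictions with the highest dot-product similarity to \(p_k\) (both the similarity measure and `m` are assumptions), the dispersion sum runs over the whole batch including \(x_k\) itself as the formula is written, and no entropy-based filtering is applied. The function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equally sized probability vectors."""
    return sum(a * b for a, b in zip(u, v))


def nearest_pseudo_source(p_k, bank_preds, m):
    """Pick the m pseudo-source predictions most similar to p_k.

    Dot-product similarity and a fixed m are assumptions of this sketch.
    """
    return sorted(bank_preds, key=lambda q: dot(p_k, q), reverse=True)[:m]


def pstarc_loss(p_k, p_tilde_k, positives, batch_preds, lam):
    """Evaluate the three-term loss for one test sample."""
    # Consistency with the strongly augmented view: -p_k^T p~_k
    aug_term = -dot(p_k, p_tilde_k)
    # Attraction toward the selected pseudo-source predictions
    attr_term = -sum(dot(p_k, q) for q in positives)
    # Dispersion away from the other predictions in the batch
    disp_term = lam * sum(dot(p_k, p_j) for p_j in batch_preds)
    return aug_term + attr_term + disp_term


# Toy example with two classes (all values are made up).
bank = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # pseudo-source predictions
batch = [[1.0, 0.0], [0.0, 1.0]]             # test-batch predictions
p_k = batch[0]
p_tilde_k = [1.0, 0.0]                       # prediction on the augmented view

positives = nearest_pseudo_source(p_k, bank, m=2)
loss = pstarc_loss(p_k, p_tilde_k, positives, batch, lam=0.5)
print(positives, loss)
```

For this toy input the loss is \(-1 - (1 + 0.9) + 0.5 \cdot (1 + 0) = -2.4\): agreement with the augmented view and with the selected pseudo-source predictions lowers the loss, while similarity to other batch predictions raises it.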

Results

TTA: Comparison with State-of-the-Art

Method        VisDA   Office-Home   DomainNet-126   CIFAR-100C
Source        43.8    59.4          55.2            53.6
BN-Adapt      66.0    58.0          57.5            64.6
TENT          70.7    58.2          58.9            68.8
AdaContrast   78.7    60.2          62.6            65.9
pSTarC        81.9    63.5          63.7            69.5

Table 1: Average accuracy (%) on TTA benchmarks. pSTarC achieves the best accuracy on all four benchmarks (VisDA, Office-Home, DomainNet-126, and CIFAR-100C) without access to source data.

CTTA: Comparison with State-of-the-Art

Method        CIFAR-100C   DomainNet-126
BN-Adapt      64.6         58.1
TENT          39.1         59.7
CoTTA         67.5         60.2
AdaContrast   66.6         65.1
RMT           69.6         65.3
pSTarC        67.7         65.5

Table 2: Average accuracy (%) on CTTA benchmarks. TENT diverges on CIFAR-100C; pSTarC remains stable and outperforms all methods on DomainNet-126.

Ablation Study

\(L_{aug}\)   \(L_{attr}\)   \(L_{disp}\)   VisDA   DomainNet-126
                                            68.8    58.8
                                            78.2    59.7
                                            80.0    63.0
                                            81.9    63.7

Table 3: Ablation of loss components on VisDA and DomainNet-126. Each term contributes, with \(L_{disp}\) providing the largest gain on DomainNet-126.

Effect of Batch Size

Method        Batch=8   Batch=16   Batch=32   Batch=64   Batch=128
TENT          38.8      55.4       58.6       59.1       58.9
AdaContrast   50.1      57.9       60.8       62.4       62.4
pSTarC        54.1      59.2       61.3       63.8       63.7

Table 4: Accuracy (%) on DomainNet-126 across batch sizes. pSTarC consistently outperforms other methods and degrades more gracefully at small batch sizes.
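The claim about graceful degradation can be checked with simple arithmetic on the Table 4 numbers. The short sketch below computes, for each method, the accuracy lost when going from a batch size of 128 down to 8; the variable names are illustrative.

```python
# Accuracy (%) on DomainNet-126, copied from Table 4.
batch_sizes = [8, 16, 32, 64, 128]
accuracy = {
    "TENT": [38.8, 55.4, 58.6, 59.1, 58.9],
    "AdaContrast": [50.1, 57.9, 60.8, 62.4, 62.4],
    "pSTarC": [54.1, 59.2, 61.3, 63.8, 63.7],
}

# Drop in accuracy points between the largest and smallest batch size.
drop = {method: round(acc[-1] - acc[0], 1) for method, acc in accuracy.items()}
print(drop)
```

The resulting drops are 20.1 points for TENT, 12.3 for AdaContrast, and 9.6 for pSTarC, so pSTarC loses the least accuracy at the smallest batch size.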

BibTeX

@inproceedings{sreenivas2024pstarc,
  author    = {Sreenivas, Manogna and Chakrabarty, Goirik and Biswas, Soma},
  title     = {pSTarC: Pseudo Source Guided Target Clustering for Fully Test-Time Adaptation},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2024}
}