pSTarC

Abstract

Test-Time Adaptation (TTA) is a pivotal concept in machine learning, enabling models to perform well in real-world scenarios where the test data distribution differs from the training distribution. In this work, we propose a novel approach called pseudo Source guided Target Clustering (pSTarC), addressing the relatively unexplored area of TTA under real-world domain shifts. This method draws inspiration from target clustering techniques and exploits the source classifier to generate pseudo-source samples. The test samples are aligned with these pseudo-source samples, which facilitates their clustering and thereby improves TTA performance. pSTarC operates solely within the fully test-time adaptation protocol, removing the need for actual source data. Experiments on VisDA, Office-Home, DomainNet-126, and CIFAR-100C verify pSTarC's effectiveness, showing significant improvements in prediction accuracy together with efficient computational requirements. We also demonstrate the universality of pSTarC by showing its effectiveness in the continual TTA (CTTA) setting.

Problem Setting

In Test-Time Adaptation (TTA), a model trained on source domain \(\mathcal{D}_s\) must adapt to a target domain \(\mathcal{D}_t\) at inference time, without access to source data or ground truth labels. In the more challenging Continual TTA (CTTA) setting, the target distribution shifts continuously over time: \(P_t^{(1)} \neq P_t^{(2)} \neq \ldots\).

Many existing TTA methods either require source data to compute statistics or rely on batch normalization updates that become unstable at small batch sizes. pSTarC addresses these limitations by operating in the fully test-time protocol: it uses no source data, no labels, and no offline statistics, and adapts purely from the test stream.

Key challenges addressed:

Source-free setting: No access to source data or source model internals beyond the final classifier.
Real-world domain shifts: Benchmarks include domain shifts across artistic styles, object categories, and image corruptions.
Memory efficiency: pSTarC maintains only a compact pseudo-source feature buffer (0.03M), vs. 4.67M for AdaContrast.

Proposed Method: pSTarC

pSTarC has two components: (1) a pseudo-source feature generation step that synthesizes source-like features from the frozen classifier, and (2) a target clustering step that aligns test samples to these pseudo-source features.

Figure: Overview of the pSTarC framework for test-time adaptation.

1. Pseudo Source Feature Generation

We randomly initialize a feature bank \(\mathbf{f}\) and iteratively optimize it, keeping the classifier \(h\) fixed, to minimize entropy while maximizing class diversity:

\[ \mathbf{f}^{*} = \arg\min_{\mathbf{f}}\; \mathcal{L}_{ent}(\mathbf{f}; h) + \beta\, \mathcal{L}_{div}(\mathbf{f}; h) \]

where

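\[ \mathcal{L}_{ent}(\mathbf{f}; h) = -\frac{1}{N}\sum_{i=1}^N\sum_{k=1}^C p_{ik} \log p_{ik} \qquad \mathcal{L}_{div}(\mathbf{f}; h) = \sum_{k=1}^C \hat{p}_k \log \hat{p}_k \]

Here \(p_{ik}\) is the probability that the classifier \(h\) assigns to class \(k\) for the \(i\)-th feature in the bank, and \(\hat{p}_k = \frac{1}{N}\sum_{i=1}^N p_{ik}\) is the mean predicted probability of class \(k\) over the bank, so minimizing \(\mathcal{L}_{div}\) maximizes the entropy of the mean prediction.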

The resulting features act as pseudo-source prototypes: low-entropy, class-discriminative feature representations that stand in for unavailable source data.
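The optimization described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the frozen classifier \(h\) is a single linear layer followed by a softmax, and it replaces an autodiff optimizer with finite-difference gradients and a step-halving rule. The names `W`, `b`, `objective`, and `generate_pseudo_source`, as well as the bank size, step count, and learning rate, are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(f, W, b):
    # Assumed form of the frozen classifier h: softmax(W f + b).
    return softmax([sum(w * x for w, x in zip(row, f)) + bk for row, bk in zip(W, b)])

def objective(bank, W, b, beta=1.0, eps=1e-12):
    # Returns (L_ent + beta * L_div, L_ent, L_div) for the whole feature bank.
    probs = [classify(f, W, b) for f in bank]
    n, c = len(probs), len(probs[0])
    l_ent = -sum(p * math.log(p + eps) for row in probs for p in row) / n
    p_hat = [sum(row[k] for row in probs) / n for k in range(c)]
    l_div = sum(p * math.log(p + eps) for p in p_hat)
    return l_ent + beta * l_div, l_ent, l_div

def numeric_grad(bank, W, b, beta, h=1e-4):
    # Central finite differences with respect to every feature coordinate.
    grad = []
    for i in range(len(bank)):
        row = []
        for d in range(len(bank[i])):
            old = bank[i][d]
            bank[i][d] = old + h
            up = objective(bank, W, b, beta)[0]
            bank[i][d] = old - h
            down = objective(bank, W, b, beta)[0]
            bank[i][d] = old  # restore the original value
            row.append((up - down) / (2 * h))
        grad.append(row)
    return grad

def generate_pseudo_source(W, b, n=9, steps=100, lr=1.0, beta=1.0, seed=0):
    rng = random.Random(seed)
    dim = len(W[0])
    # Random initialization of the feature bank; only the bank is updated.
    bank = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        base = objective(bank, W, b, beta)[0]
        grad = numeric_grad(bank, W, b, beta)
        step = lr
        while step > 1e-6:  # halve the step until the objective improves
            trial = [[x - step * g for x, g in zip(f, gf)] for f, gf in zip(bank, grad)]
            if objective(trial, W, b, beta)[0] < base:
                bank = trial
                break
            step /= 2
    return bank

# Hypothetical frozen 3-class classifier over 4-dimensional features.
W = [[1.0, 0.0, 0.0, 0.5], [0.0, 1.0, 0.0, -0.5], [0.0, 0.0, 1.0, 0.0]]
b = [0.0, 0.0, 0.0]
bank = generate_pseudo_source(W, b)
total, l_ent, l_div = objective(bank, W, b)
print(f"L_ent={l_ent:.3f}  L_div={l_div:.3f}  total={total:.3f}")
```

Because a step is only accepted when it lowers the objective, the combined loss never increases during the run. In a real setting the same objective would more likely be minimized with an autodiff framework, with the classifier weights kept frozen.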

2. Pseudo Source Guided Target Clustering

At test time, each test sample \(x_k\) with prediction \(p_k\) is adapted via three terms: (i) \(L_{aug}\): consistency between the prediction and its strong augmentation \(\tilde{p}_k\); (ii) \(L_{attr}\): attraction of low-entropy samples toward the nearest pseudo-source features \(\mathbf{p}^+\); (iii) \(L_{disp}\): dispersion of predictions across the batch to prevent collapse.

\[ \mathcal{L}_{\textrm{pSTarC}}(x_k) = \underbrace{-p_k^T\tilde{p}_k}_{L_{aug}} - \underbrace{\sum_{p_j^+\in\mathbf{p}^+} p_k^T p_j^+}_{L_{attr}} + \underbrace{\lambda\sum_{x_j\in\mathbf{x}_t} p_k^T p_j}_{L_{disp}} \]
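To make the three terms concrete, the sketch below evaluates the loss above on plain lists of softmax predictions. It is an illustration under assumptions rather than the authors' code: the positives \(\mathbf{p}^+\) are chosen as the `num_pos` pseudo-source predictions with the largest inner product with \(p_k\), the low-entropy gating mentioned in the text is omitted, and `pstarc_loss`, `num_pos`, and the value `lam=0.5` are hypothetical. In practice these quantities would be tensors in an autodiff framework so that the loss can be backpropagated.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pstarc_loss(preds, aug_preds, pseudo_preds, num_pos=1, lam=0.5):
    """Mean of L_aug + L_attr + L_disp over a batch of softmax predictions."""
    losses = []
    for p_k, p_aug in zip(preds, aug_preds):
        # L_aug: agreement between a prediction and its strong augmentation.
        l_aug = -dot(p_k, p_aug)
        # Assumed selection rule: the num_pos pseudo-source predictions
        # most similar to p_k act as positives.
        positives = sorted(pseudo_preds, key=lambda q: dot(p_k, q), reverse=True)[:num_pos]
        l_attr = -sum(dot(p_k, q) for q in positives)
        # L_disp: summed over the whole batch as the formula is written (includes j = k).
        l_disp = lam * sum(dot(p_k, p_j) for p_j in preds)
        losses.append(l_aug + l_attr + l_disp)
    return sum(losses) / len(losses)

# Toy comparison: a well-spread batch versus a batch collapsed onto one class.
one_hot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clustered = pstarc_loss(one_hot, one_hot, one_hot)
collapsed = pstarc_loss([one_hot[0]] * 3, [one_hot[0]] * 3, one_hot)
print(clustered, collapsed)
```

In this toy case the collapsed batch receives the higher loss, and the difference comes entirely from \(L_{disp}\), which is the role the text assigns to that term.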

Results

TTA: Comparison with State-of-the-Art

Method	VisDA	Office-Home	DomainNet-126	CIFAR-100C
Source	43.8	59.4	55.2	53.6
BN-Adapt	66.0	58.0	57.5	64.6
TENT	70.7	58.2	58.9	68.8
AdaContrast	78.7	60.2	62.6	65.9
pSTarC	81.9	63.5	63.7	69.5

Table 1: Average accuracy (%) on TTA benchmarks. pSTarC achieves the best accuracy on all four benchmarks (VisDA, Office-Home, DomainNet-126, and CIFAR-100C) without access to source data.

CTTA: Comparison with State-of-the-Art

Method	CIFAR-100C	DomainNet-126
BN-Adapt	64.6	58.1
TENT	39.1	59.7
CoTTA	67.5	60.2
AdaContrast	66.6	65.1
RMT	69.6	65.3
pSTarC	67.7	65.5

Table 2: Average accuracy (%) on CTTA benchmarks. TENT diverges on CIFAR-100C (39.1); pSTarC remains stable, trailing only RMT on CIFAR-100C and outperforming all methods on DomainNet-126.

Ablation Study

\(L_{aug}\)	\(L_{attr}\)	\(L_{disp}\)	VisDA	DomainNet-126
✓			68.8	58.8
✓	✓		78.2	59.7
✓		✓	80.0	63.0
✓	✓	✓	81.9	63.7

Table 3: Ablation of loss components on VisDA and DomainNet-126. Each term contributes, with \(L_{disp}\) providing the largest gain on DomainNet-126.
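
The per-term gains follow directly from the numbers in Table 3, taking the \(L_{aug}\)-only row as the baseline:

\[ \begin{aligned} \text{VisDA:}\quad & +L_{attr}:\; 78.2 - 68.8 = 9.4, \qquad +L_{disp}:\; 80.0 - 68.8 = 11.2, \qquad \text{all three}:\; 81.9 - 68.8 = 13.1 \\ \text{DomainNet-126:}\quad & +L_{attr}:\; 59.7 - 58.8 = 0.9, \qquad +L_{disp}:\; 63.0 - 58.8 = 4.2, \qquad \text{all three}:\; 63.7 - 58.8 = 4.9 \end{aligned} \]

Adding \(L_{disp}\) alone yields a larger gain than adding \(L_{attr}\) alone on both datasets, and combining all three terms gives the highest accuracy in each column.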

Effect of Batch Size

Method	Batch=8	Batch=16	Batch=32	Batch=64	Batch=128
TENT	38.8	55.4	58.6	59.1	58.9
AdaContrast	50.1	57.9	60.8	62.4	62.4
pSTarC	54.1	59.2	61.3	63.8	63.7

Table 4: Accuracy (%) on DomainNet-126 across batch sizes. pSTarC consistently outperforms other methods and degrades more gracefully at small batch sizes.
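
One simple way to quantify "degrades more gracefully" is the accuracy lost when going from the largest to the smallest batch size. The short script below computes that drop from the Table 4 numbers; the metric and the helper name `accuracy_drop` are illustrative choices, not something defined by the authors.

```python
# Accuracy (%) on DomainNet-126 copied from Table 4, keyed by batch size.
table4 = {
    "TENT": {8: 38.8, 16: 55.4, 32: 58.6, 64: 59.1, 128: 58.9},
    "AdaContrast": {8: 50.1, 16: 57.9, 32: 60.8, 64: 62.4, 128: 62.4},
    "pSTarC": {8: 54.1, 16: 59.2, 32: 61.3, 64: 63.8, 128: 63.7},
}

def accuracy_drop(acc, large=128, small=8):
    # Percentage points lost when shrinking the batch from `large` to `small`.
    return round(acc[large] - acc[small], 1)

drops = {method: accuracy_drop(acc) for method, acc in table4.items()}
for method, drop in sorted(drops.items(), key=lambda kv: kv[1]):
    print(f"{method}: {drop} points lost from batch 128 to batch 8")
```

By this measure pSTarC loses the fewest points of the three methods, while TENT loses the most.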

BibTeX

@inproceedings{sreenivas2024pstarc,
  author    = {Sreenivas, Manogna and Chakrabarty, Goirik and Biswas, Soma},
  title     = {pSTarC: Pseudo Source Guided Target Clustering for Fully Test-Time Adaptation},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2024}
}