MICCAI 2026 · Early Accept

Two-Stage Cross-Domain Cervical Abnormality Screening with Cytopathological Image Synthesis and Knowledge Distillation

Jincheng Li¹, Yuzhi He², Yihui Zhan¹, Xinmei Zhang¹, Yifei Sun³, Zelin Liu⁴, Lichi Zhang⁴, Minye Shao⁵, Lili Zhao¹
¹ Nantong University  ·  ² Xidian University  ·  ³ Zhejiang University  ·  ⁴ Shanghai Jiao Tong University  ·  ⁵ Durham University
* Equal contribution † Corresponding author: ylzh@ntu.edu.cn

Abstract

Cross-domain diagnosis remains a major challenge in cervical cell pathology due to pronounced domain shifts across institutions and the subtle visual differences among disease stages, which jointly impair model generalization. To address these issues, this paper proposes a two-stage framework for cross-domain cervical cell detection. In the first stage, we propose the Spatially-Continuous Unpaired Neural Schrödinger Bridge (SC-UNSB), which constructs a synthetic intermediate domain to mitigate cross-domain distribution shifts by modeling image translation as an entropy-regularized optimal transport process. In the second stage, we propose a dual-level feature alignment strategy within a knowledge distillation framework, which progressively aligns shallow structural features and deep semantic representations to facilitate the transfer of domain-invariant knowledge from the source to the target model. Experiments with CRIC as the source domain and ComparisonDetector as the target domain show that the proposed method mitigates domain shift and category ambiguity, reaching 26.9% mAP and 45.8% mAP50 in cross-domain detection.

Cross-domain Cell Detection  ·  Schrödinger Bridge  ·  Knowledge Distillation  ·  Medical Image Analysis  ·  Cytopathology

01

SC-UNSB

Spatially-continuous image synthesis via entropy-regularized optimal transport

02

Dual-Level Alignment

Progressive feature alignment from shallow structures to deep semantics

03

State-of-the-Art

26.9% mAP and 45.8% mAP50 on cross-domain cervical cell detection

Two-Stage Framework

Combining generative domain bridging with progressive feature alignment for cross-domain cervical cell detection

STAGE 01

SC-UNSB: Spatially-Continuous Image Synthesis

To mitigate cross-domain appearance discrepancies, we build upon the Unpaired Neural Schrödinger Bridge (UNSB), which formulates unpaired image translation as an entropy-regularized optimal transport problem. Our key innovation is the Dense Normalization (DN) module that ensures spatially continuous statistical fields.
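
To make "entropy-regularized optimal transport" concrete, the following is a minimal discrete sketch using Sinkhorn iterations between two small histograms. It is only an illustration of the underlying problem, not the UNSB or SC-UNSB training procedure, which learns a bridge between image distributions with neural networks. The function name `sinkhorn_plan`, the toy histograms, and the values of `eps` and `n_iters` are assumptions made for this example.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between histograms a and b (toy example)."""
    kernel = np.exp(-cost / eps)      # Gibbs kernel; eps weights the entropy term
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)          # rescale rows to match marginal a
        v = b / (kernel.T @ u)        # rescale columns to match marginal b
    return u[:, None] * kernel * v[None, :]

# Two toy 1-D distributions on the same five support points (hypothetical data).
x = np.linspace(0.0, 1.0, 5)
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
cost = (x[:, None] - x[None, :]) ** 2   # squared-distance transport cost
plan = sinkhorn_plan(a, b, cost)
```

A larger `eps` spreads mass more diffusely across the plan, while a smaller `eps` approaches the unregularized transport plan at the price of slower convergence.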

Standard Instance Normalization computes statistics independently for each patch, causing boundary drift errors and tiling artifacts. SC-UNSB re-parameterizes statistical moments as continuous functions of pixel coordinates using bilinear interpolation from neighboring patches.

  • Entropy-regularized optimal transport for unpaired translation
  • Dense pixel-level moment estimation via 3×3 neighborhood interpolation
  • Eliminates tiling artifacts in ultra-high-resolution images
  • Preserves high-frequency biological details and cell morphology
Figure 1. Overview of SC-UNSB. (a) Schrödinger Bridge learning process. (b) Dense pixel-level moment estimation. (c) Dispatcher and Dense Normalization architecture.
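
The idea of replacing per-patch statistics with a spatially continuous field can be sketched as follows. This is a simplified single-channel NumPy illustration, not the authors' DN module: it anchors each patch's mean and standard deviation at the patch center and interpolates bilinearly from the nearest 2×2 patch centers, whereas the page describes interpolation over a 3×3 neighborhood. The function names, the `eps` constant, and the toy image are assumptions.

```python
import numpy as np

def patch_stats(img, patch):
    """Mean and std of each non-overlapping patch x patch tile of a 2-D image."""
    h, w = img.shape
    assert h % patch == 0 and w % patch == 0, "sketch assumes exact tiling"
    tiles = img.reshape(h // patch, patch, w // patch, patch)
    return tiles.mean(axis=(1, 3)), tiles.std(axis=(1, 3))

def bilinear_field(grid, patch, h, w):
    """Interpolate per-patch values, anchored at patch centers, to every pixel."""
    gh, gw = grid.shape
    # Position of each pixel center in patch-grid coordinates, clamped at borders.
    ys = np.clip((np.arange(h) + 0.5) / patch - 0.5, 0, gh - 1)
    xs = np.clip((np.arange(w) + 0.5) / patch - 0.5, 0, gw - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, gh - 1)
    x1 = np.minimum(x0 + 1, gw - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bottom = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def dense_normalize(img, patch, eps=1e-5):
    """Normalize each pixel with smoothly varying moments instead of per-patch ones."""
    mean_grid, std_grid = patch_stats(img, patch)
    h, w = img.shape
    mu = bilinear_field(mean_grid, patch, h, w)
    sigma = bilinear_field(std_grid, patch, h, w)
    return (img - mu) / (sigma + eps)

# Toy image with a horizontal intensity ramp (hypothetical data).
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8)) + np.linspace(0.0, 4.0, 8)[None, :]
normalized = dense_normalize(image, patch=4)
```

Because the interpolated moments change gradually across patch borders, neighboring pixels on either side of a border are normalized with nearly the same statistics, which is the property that suppresses tiling seams.
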
STAGE 02

Dual-Level Feature Alignment

We propose a dual-level feature alignment strategy within a knowledge distillation framework consisting of two complementary components:

Loose Feature Alignment (LFA) operates on shallow features to preserve structural information that is less sensitive to semantic variation but vulnerable to domain shift. We transform the features into the frequency domain and apply a multi-scale low-pass filter before alignment.
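
A frequency-domain alignment of this kind might look like the sketch below. It is an assumption-laden illustration rather than the paper's LFA loss: the ideal (hard-cutoff) low-pass mask, the three cutoff values, and the L1 distance are choices made here for clarity, and `low_pass` and `lfa_loss` are hypothetical names.

```python
import numpy as np

def low_pass(feat, cutoff):
    """Keep spatial frequencies with radius <= cutoff (cycles per pixel).

    feat has shape (C, H, W); filtering is applied per channel.
    """
    _, h, w = feat.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mask = (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(feat.dtype)
    spectrum = np.fft.fft2(feat, axes=(-2, -1))
    return np.fft.ifft2(spectrum * mask, axes=(-2, -1)).real

def lfa_loss(student, teacher, cutoffs=(0.1, 0.2, 0.4)):
    """Average L1 distance between low-passed features over several scales."""
    per_scale = [
        np.mean(np.abs(low_pass(student, c) - low_pass(teacher, c)))
        for c in cutoffs
    ]
    return float(np.mean(per_scale))

# Toy shallow features: the student differs from the teacher only by a
# pixel-level checkerboard, i.e. the highest spatial frequency (hypothetical data).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 8, 8))
checker = (-1.0) ** np.add.outer(np.arange(8), np.arange(8))
student = teacher + checker
loss = lfa_loss(student, teacher)
```

In this toy case the loss is essentially zero, because the checkerboard lies above every cutoff. That is the sense in which the alignment is "loose": only coarse structure is matched, and fine detail is left unconstrained.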

Compact Feature Alignment (CFA) aligns high-level semantic representations from the penultimate layer. A 1×1 convolution projects features into a unified embedding space, promoting transfer of class-discriminative knowledge.
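
As a rough sketch of the projection step, a 1×1 convolution is simply a linear map over channels applied independently at every spatial location. The example below projects both models' features into a shared embedding and compares them with a mean-squared error. Projecting both sides, the MSE distance, the channel counts, and the names `conv1x1` and `cfa_loss` are assumptions for illustration; the page states only that a 1×1 convolution maps features into a unified embedding space.

```python
import numpy as np

def conv1x1(feat, weight, bias):
    """1x1 convolution on a (C_in, H, W) map with weight of shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, feat) + bias[:, None, None]

def cfa_loss(student, teacher, w_s, b_s, w_t, b_t):
    """MSE between student and teacher features after projection to a shared space."""
    z_student = conv1x1(student, w_s, b_s)
    z_teacher = conv1x1(teacher, w_t, b_t)
    return float(np.mean((z_student - z_teacher) ** 2))

# Toy penultimate-layer features with different channel counts (hypothetical data).
rng = np.random.default_rng(0)
student_feat = rng.normal(size=(6, 5, 5))
teacher_feat = rng.normal(size=(8, 5, 5))
embed_dim = 4
w_s, b_s = rng.normal(size=(embed_dim, 6)), np.zeros(embed_dim)
w_t, b_t = rng.normal(size=(embed_dim, 8)), np.zeros(embed_dim)
loss = cfa_loss(student_feat, teacher_feat, w_s, b_s, w_t, b_t)
```

In a real training loop the projection weights would be learned jointly with the detector instead of being sampled at random as they are here.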

  • Frequency-domain alignment for shallow structural features (LFA)
  • Unified embedding space for semantic alignment (CFA)
  • Coarse-to-fine progressive knowledge transfer strategy
  • Joint optimization with detection task losses
Figure 2. Dual-level feature alignment. Source model guides target model via LFA for structural patterns and CFA for high-level semantics.
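
One plausible way to write the joint objective is as a weighted sum of the detection loss and the two alignment terms. The additive form and the weights $\lambda_{\text{LFA}}$ and $\lambda_{\text{CFA}}$ are assumptions for illustration; this page does not give the exact formulation or any weight values.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{det}}
  + \lambda_{\text{LFA}} \, \mathcal{L}_{\text{LFA}}
  + \lambda_{\text{CFA}} \, \mathcal{L}_{\text{CFA}}
```

Here $\mathcal{L}_{\text{det}}$ stands for the detector's own classification and box-regression losses, $\mathcal{L}_{\text{LFA}}$ for the shallow frequency-domain alignment, and $\mathcal{L}_{\text{CFA}}$ for the semantic embedding alignment.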

Results

Evaluated on CRIC (source) and ComparisonDetector (target) cervical cytology datasets

Best mAP: 26.9%  ·  Best mAP50: 45.8%  ·  Best NIQE ↓: 11.38  ·  Best HIST ↑: 0.754

Method       Ds       CycleGAN  CUT      NOT      i2i-Turbo  UNSB     SC-UNSB

Image Generation Quality
FID ↓        241.05   147.28    132.61   177.43   154.65     143.43   135.31
KID×100 ↓    10.896   1.831     1.365    5.104    2.514      2.411    1.807
NIQE ↓       14.72    13.51     14.60    12.86    16.39      13.94    11.38
HIST ↑       0.384    0.695     0.722    0.571    0.473      0.701    0.754

RetinaNet Detection
mAP ↑        4.6%     10.5%     14.0%    9.3%     10.3%      17.8%    20.2%
mAP50 ↑      9.3%     22.4%     29.3%    18.0%    26.8%      35.4%    41.5%

RetinaNet + LFA + CFA (Full Model)
mAP ↑        12.6%    18.8%     22.7%    11.3%    15.1%      24.1%    26.9%
mAP50 ↑      26.6%    31.8%     40.3%    21.9%    30.9%      42.6%    45.8%

Table 1: Comparison across generation quality and detection performance metrics.

Figure 3. Qualitative comparison between different generators. SC-UNSB effectively mitigates tiling artifacts and boundary inconsistency, producing spatially coherent cell structures.
Figure 4. (a) Radar chart of quantitative metrics. (b) Pixel-wise estimated statistics visualization. SC-UNSB produces smooth, spatially coherent moment fields.

Method       KD       DKD      SPD      Ours
mAP ↑        21.7%    18.6%    22.3%    26.9%
mAP50 ↑      40.6%    36.7%    43.4%    45.8%

Table 2: Comparison with knowledge distillation methods.

Conclusion

This work presents a two-stage framework for cross-domain cervical cell detection that explicitly addresses both appearance-level domain shift and representation-level feature misalignment. By constructing a spatially coherent intermediate domain through SC-UNSB and introducing dual-level feature alignment within a distillation framework, the proposed approach enhances the transfer of domain-invariant knowledge across institutions. These results highlight the potential of combining generative domain bridging with progressive feature alignment to enable cross-domain diagnosis in cervical cytopathology.

Acknowledgments

This work was supported by the Natural Science Foundation of Jiangsu Province (BK20251838) and the Nantong Science and Technology Program Project (JC2024055).

Cite This Work

@inproceedings{li2026twostage,
  title     = {Two-Stage Cross-Domain Cervical 
               Abnormality Screening with 
               Cytopathological Image Synthesis 
               and Knowledge Distillation},
  author    = {Li, Jincheng and He, Yuzhi and 
               Zhan, Yihui and Zhang, Xinmei and 
               Sun, Yifei and Liu, Zelin and 
               Zhang, Lichi and Shao, Minye and 
               Zhao, Lili},
  booktitle = {International Conference on Medical 
               Image Computing and Computer-Assisted 
               Intervention},
  year      = {2026},
  organization = {Springer}
}