DGS-Net: Distillation-Guided Gradient Surgery for CLIP Fine-Tuning in AI-Generated Image Detection

1School of Computer Science, Nanjing University of Information Science and Technology
2University of Macau
Corresponding author

ICML 2026 Spotlight

Abstract

The rapid progress of generative models such as GANs and diffusion models has led to the widespread proliferation of AI-generated images, raising concerns about misinformation and trust erosion in digital media. Although large-scale multimodal models like CLIP offer strong transferable representations for detecting synthetic content, fine-tuning them often induces catastrophic forgetting, which degrades pre-trained priors and limits cross-domain generalization. To address this issue, we propose the Distillation-guided Gradient Surgery Network (DGS-Net), a novel framework that preserves transferable pre-trained priors while suppressing task-irrelevant components. Specifically, we introduce a gradient-space decomposition that separates harmful and beneficial descent directions during optimization. By projecting task gradients onto the orthogonal complement of harmful directions and aligning with beneficial ones distilled from a frozen CLIP encoder, DGS-Net achieves unified optimization of prior preservation and irrelevant suppression. Extensive experiments on 50 generative models demonstrate that our method outperforms state-of-the-art approaches by an average margin of 6.6%, achieving superior detection performance and generalization.

Key Contributions

  • To the best of our knowledge, this is the first work to systematically diagnose catastrophic forgetting induced by CLIP fine-tuning in AI-generated image detection. We further introduce a novel gradient-space decomposition that disentangles CLIP representations into transferable pre-trained priors and task-irrelevant components.
  • We propose an innovative detection framework, DGS-Net, which employs distillation-guided gradient decoupling and alignment to preserve transferable pre-trained priors while suppressing task-irrelevant knowledge, thereby improving both detection accuracy and cross-domain generalization.
  • Extensive experiments across 50 diverse generative models demonstrate that our method consistently outperforms state-of-the-art approaches, achieving an average improvement of 6.6% in detection accuracy and highlighting its robustness and universality.
Method Overview

DGS-Net pipeline
Fig 2. Overview of the proposed Distillation-guided Gradient Surgery Network (DGS-Net). We introduce a gradient-space decomposition that separates harmful and beneficial descent directions during optimization. The framework consists of two core components, Orthogonal Suppression and Prior Alignment, which suppress task-irrelevant representations and preserve the transferable priors established during large-scale pre-training, substantially enhancing the generalization of AI-generated image detection.

I. Orthogonal Suppression. The image gradients of the training network are orthogonally projected onto the subspace complementary to the harmful directions inferred from text gradients, thereby mitigating cross-modal interference.
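As a concrete illustration, here is a minimal PyTorch sketch of such a projection. The function name, the use of flattened per-modality gradient vectors, and the epsilon guard are our assumptions for illustration, not the paper's exact implementation.

```python
import torch

def orthogonal_suppression(g_img: torch.Tensor, g_txt: torch.Tensor,
                           eps: float = 1e-12) -> torch.Tensor:
    """Project the image gradient onto the orthogonal complement of the
    harmful direction inferred from the text gradient.

    g_img, g_txt: flattened 1-D gradient vectors of the same shape.
    """
    # Component of g_img along the (assumed harmful) text-gradient direction.
    coef = torch.dot(g_img, g_txt) / (torch.dot(g_txt, g_txt) + eps)
    # Removing that component keeps only the part of the update orthogonal
    # to the harmful direction, mitigating cross-modal interference.
    return g_img - coef * g_txt
```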

II. Prior Alignment. The coordinate-wise negative components of the frozen CLIP image gradients are introduced as lightweight alignment signals to reinforce beneficial pre-trained priors.
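One plausible reading of this step, sketched below, keeps the coordinate-wise negative part of the frozen encoder's gradient (the directions in which the pre-trained prior would itself descend) and blends it into the task gradient. The clamp-based extraction and the strength `lam` are hypothetical choices for illustration, not confirmed details of DGS-Net.

```python
import torch

def prior_alignment(g_task: torch.Tensor, g_frozen: torch.Tensor,
                    lam: float = 0.1) -> torch.Tensor:
    """Reinforce beneficial pre-trained priors using the coordinate-wise
    negative components of the frozen CLIP image gradient.

    g_frozen: gradient of the same loss taken through the frozen encoder,
    flattened to match g_task.
    """
    # Keep only coordinates where the frozen gradient is negative, i.e. the
    # descent directions favoured by the pre-trained prior.
    beneficial = torch.clamp(g_frozen, max=0.0)
    # Blend them in as a lightweight alignment signal (lam is hypothetical).
    return g_task + lam * beneficial
```

In a full training step, the two operations would compose before the optimizer update, e.g. `g = prior_alignment(orthogonal_suppression(g_img, g_txt), g_frozen)`.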

Quantitative Results

Detection table
Detection Performance: DGS-Net outperforms existing methods on the AIGCDetectBench and AIGIBench datasets.

Overall, the quantitative results confirm the effectiveness of DGS-Net, which preserves transferable pre-trained priors while suppressing task-irrelevant knowledge, thereby improving both detection accuracy and cross-domain generalization.

BibTeX

@article{yan2025dgs,
  title={DGS-Net: Distillation-Guided Gradient Surgery for CLIP Fine-Tuning in AI-Generated Image Detection},
  author={Yan, Jiazhen and Li, Ziqiang and Wang, Fan and Wang, Boyu and He, Ziwen and Fu, Zhangjie},
  journal={arXiv preprint arXiv:2511.13108},
  year={2025}
}