Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective

Abstract: The proliferation of AI-generated imagery poses escalating challenges for multimedia forensics, yet many existing detectors depend on assumptions about the internals of specific generative models, which limits their cross-model applicability. We introduce a self-supervised approach for detecting AI-generated images that leverages camera metadata, specifically exchangeable image file format (EXIF) tags, to learn features intrinsic to digital photography. Our pretext task trains a feature extractor solely on camera-captured photographs by classifying categorical EXIF tags (e.g., camera model and scene type) and pairwise-ranking ordinal and continuous EXIF tags (e.g., focal length and aperture value). Using these EXIF-induced features, we first perform one-class detection by modeling the distribution of photographic images with a Gaussian mixture model and flagging low-likelihood samples as AI-generated. We then extend the approach to binary detection, treating the learned extractor as a strong regularizer for a classifier of the same architecture that operates on high-frequency residuals from spatially scrambled patches. Extensive experiments across various generative models demonstrate that our EXIF-induced detectors substantially advance the state of the art, generalizing strongly to in-the-wild samples and remaining robust to common benign image perturbations.
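
The one-class detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian mixture with diagonal covariances whose parameters have already been fitted to features of camera-captured photographs, and every name in it (`gmm_log_likelihood`, `choose_threshold`, `is_ai_generated`, the toy 2-D features, and the false-alarm rate) is hypothetical. The abstract specifies neither the covariance structure nor how the decision threshold is chosen.

```python
import math
from typing import List, Sequence, Tuple

# One mixture component: (weight, per-dimension means, per-dimension variances).
# Diagonal covariance is an assumption made here for brevity.
Component = Tuple[float, Sequence[float], Sequence[float]]


def gmm_log_likelihood(x: Sequence[float], components: List[Component]) -> float:
    """Log-density of feature vector x under a diagonal-covariance Gaussian mixture."""
    log_terms = []
    for weight, means, variances in components:
        log_prob = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_prob += -0.5 * (math.log(2.0 * math.pi * var) + (xi - mu) ** 2 / var)
        log_terms.append(log_prob)
    # Log-sum-exp keeps the sum stable when individual terms are very negative.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def choose_threshold(photo_scores: Sequence[float], false_alarm_rate: float) -> float:
    """Score below which roughly `false_alarm_rate` of real photographs fall."""
    ranked = sorted(photo_scores)
    index = min(len(ranked) - 1, int(false_alarm_rate * len(ranked)))
    return ranked[index]


def is_ai_generated(x: Sequence[float], components: List[Component], threshold: float) -> bool:
    """Flag a sample whose likelihood under the photographic model is below the threshold."""
    return gmm_log_likelihood(x, components) < threshold


# Toy, made-up mixture parameters and features standing in for EXIF-induced features.
components = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [3.0, 3.0], [0.5, 0.5])]
photo_features = [[0.1, -0.2], [2.9, 3.1], [0.5, 0.3], [3.2, 2.8], [-0.4, 0.2]]
photo_scores = [gmm_log_likelihood(f, components) for f in photo_features]
threshold = choose_threshold(photo_scores, false_alarm_rate=0.05)

outlier = [10.0, -10.0]  # far from both mixture components
print(is_ai_generated(outlier, components, threshold))
```

In this sketch the threshold is set from held-out photographic scores only, which matches the one-class setting: no AI-generated samples are needed to calibrate the detector.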


Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2512.05651 [cs.CV]
	(or arXiv:2512.05651v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2512.05651 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Nan Zhong
[v1] Fri, 5 Dec 2025 11:53:18 UTC (41,460 KB)
