So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection

1 University of Liverpool, UK 2 ByteDance 3 Zhejiang University
4 National University of Singapore 5 The Hong Kong University of Science and Technology

Overview of So-Fake Dataset

So-Fake-R1

Overview of So-Fake Dataset. (a) A comparison of forgery detection datasets from 2020 to 2025 shows that ours covers the most recent generative methods and the largest number of them. (b) So-Fake-Set covers diverse real social media scenarios spanning 12 categories. (c) Generative methods and visual examples from So-Fake-Set and So-Fake-OOD.

Abstract

Recent advances in AI-powered generative models have enabled the creation of increasingly realistic synthetic images, posing significant risks to information integrity and public trust on social media platforms. Robust detection frameworks and diverse, large-scale datasets are essential to mitigate these risks, yet existing academic efforts remain limited in scope: current datasets lack the diversity, scale, and realism required for social media contexts, and detection methods struggle to generalize to unseen generative models.

To bridge this gap, we introduce So-Fake-Set, a comprehensive social media-oriented dataset of over 2 million high-quality images spanning diverse generative sources, with photorealistic imagery synthesized by 35 state-of-the-art generative models. To rigorously evaluate cross-domain robustness, we establish So-Fake-OOD, a large-scale out-of-domain benchmark of 100K images featuring synthetic imagery from commercial models explicitly excluded from the training distribution, creating a realistic testbed for real-world performance.

Leveraging these resources, we present So-Fake-R1, an advanced vision-language framework that employs reinforcement learning for highly accurate forgery detection, precise localization, and explainable inference through interpretable visual rationales. Extensive experiments show that So-Fake-R1 outperforms the second-best method, with a 1.3% gain in detection accuracy and a 4.5% increase in localization IoU. By integrating a scalable dataset, a challenging OOD benchmark, and an advanced detection framework, this work establishes a new foundation for social media-centric forgery detection research.
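As a point of reference for the localization metric above: IoU (intersection over union) measures the overlap between a predicted forged region and the ground-truth region. Below is a minimal, self-contained Python sketch for axis-aligned boxes; the box representation is an illustrative assumption, not the paper's exact region format.

# Minimal IoU sketch for axis-aligned boxes given as (x1, y1, x2, y2).
# The box representation is an illustrative assumption.
def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a prediction partially overlapping the ground truth.
print(box_iou((10, 10, 60, 60), (20, 20, 70, 70)))  # ~0.47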

Comparison with Existing Image Deepfake Datasets


Comparison with existing image forgery datasets. “–” in #Methods indicates the number of generative methods was not specified; “#” denotes an exact count. Column abbreviations: MultiCls = Multiclasses, Expl. = Explanation, OOD = Out-of-distribution benchmark.

Generative Methods Used in So-Fake-Set and So-Fake-OOD


Details of generative methods used in constructing So-Fake-Set and So-Fake-OOD. Column abbreviations: Set = So-Fake-Set, OOD = So-Fake-OOD, F = fully synthetic images, T = tampered images. Real data source abbreviations: F30k = Flickr30k, OI = OpenImages, OF = OpenForensics.

Overview of Dataset Construction


Overview of dataset construction. (a) Data sources and visual examples for So-Fake-Set and So-Fake-OOD. (b) Generation process of full synthetic images (left) and tampered images (right).
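To make the tampered-image branch (right) concrete, here is a rough sketch of one way such a pipeline can be realized with an off-the-shelf inpainting model: mask an object region in a real photo and let the model resynthesize it. The checkpoint, file names, and prompt below are placeholders; So-Fake-Set and So-Fake-OOD draw on many different generative methods.

# Rough sketch of a tampering pipeline: mask an object region in a real
# image and regenerate it with a diffusion inpainting model. Checkpoint,
# file names, and prompt are placeholders, not the paper's exact setup.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("real_photo.png").convert("RGB").resize((512, 512))
mask = Image.open("object_mask.png").convert("L").resize((512, 512))  # white = region to regenerate

tampered = pipe(
    prompt="a person standing on a sidewalk",  # placeholder edit prompt
    image=image,
    mask_image=mask,
).images[0]
tampered.save("tampered_photo.png")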

Visual Cases

Figure: Visual Examples of Full Synthetic Images from So-Fake-OOD

This figure showcases diverse examples from So-Fake-OOD, a benchmark specifically curated to evaluate out-of-distribution generalization. The images are collected from Reddit subreddits, capturing complex, authentic content across a wide range of topics. Their diversity and realism pose significant challenges for forgery detection models, making So-Fake-OOD well suited to testing robustness beyond curated datasets.
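The practical consequence is a gap between in-domain and out-of-domain accuracy for detectors trained on curated data. A minimal sketch of measuring that gap is shown below; the folder layout and label convention are hypothetical, not part of the released benchmark.

# Hypothetical sketch: measuring the in-domain vs. out-of-domain accuracy
# gap. The "real"/"fake" subdirectory layout is an assumption.
from pathlib import Path

def iter_samples(root: Path):
    """Yield (image_path, label) pairs; label 1 = fake, 0 = real."""
    for name, label in (("real", 0), ("fake", 1)):
        yield from ((p, label) for p in sorted((root / name).glob("*")))

def accuracy(predict, root: Path) -> float:
    """predict: a callable mapping an image path to a 0/1 decision."""
    pairs = list(iter_samples(root))
    return sum(predict(p) == y for p, y in pairs) / len(pairs)

# Detectors that score well in-domain can drop sharply out of domain:
# acc_id  = accuracy(my_detector, Path("so-fake-set/test"))  # hypothetical paths
# acc_ood = accuracy(my_detector, Path("so-fake-ood"))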

Figure: Visual Examples of Tampered Images from So-Fake-OOD

Method: So-Fake-R1


(a) Overview of the So-Fake-R1 training process; (b) the detailed So-Fake-R1 GRPO training process. The example shows a tampered image in which a boy has been manipulated.
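For readers unfamiliar with GRPO (Group Relative Policy Optimization), its central step is simple: sample a group of responses per input, score each with a task reward, and normalize the rewards within the group to obtain advantages. Below is a minimal sketch of that step with an illustrative composite reward; the actual reward terms and weights used by So-Fake-R1 are not reproduced here.

# Minimal sketch of GRPO's group-relative advantage computation. The reward
# terms and weights below are illustrative assumptions combining detection,
# localization, and output-format signals.
import statistics

def toy_reward(pred_label, gt_label, loc_iou, well_formatted):
    r = 1.0 if pred_label == gt_label else 0.0   # detection term
    r += loc_iou                                  # localization term in [0, 1]
    r += 0.1 if well_formatted else 0.0          # output-format term
    return r

def group_advantages(rewards):
    """Normalize each reward against its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0      # guard against zero variance
    return [(r - mean) / std for r in rewards]

# G = 4 sampled analyses of one tampered image:
rewards = [toy_reward("fake", "fake", 0.8, True),
           toy_reward("fake", "fake", 0.3, True),
           toy_reward("real", "fake", 0.0, False),
           toy_reward("fake", "fake", 0.6, False)]
print(group_advantages(rewards))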

Video Presentation

Experiments

Visual Examples of So-Fake-R1

BibTeX

@misc{huang2025sofakebenchmarkingexplainingsocial,
      title={So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection},
      author={Zhenglin Huang and Tianxiao Li and Xiangtai Li and Haiquan Wen and Yiwei He and Jiangning Zhang and Hao Fei and Xi Yang and Xiaowei Huang and Bei Peng and Guangliang Cheng},
      year={2025},
      eprint={2505.18660},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2505.18660},
}