So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection

1 University of Liverpool, UK 2 ByteDance 3 Zhejiang University
4 National University of Singapore 5 The Hong Kong University of Science and Technology

Overview of So-Fake Dataset

So-Fake-R1

Overview of So-Fake Dataset. (a) A comparison of forgery detection datasets from 2020 to 2025 shows that ours covers the most recent generative methods and the largest number of them. (b) So-Fake-Set covers diverse real social media scenarios spanning 12 categories. (c) Generative methods and visual examples from So-Fake-Set and So-Fake-OOD.

Abstract

Recent advances in AI-powered generative models have enabled the creation of increasingly realistic synthetic images, posing significant risks to information integrity and public trust on social media platforms. Robust detection frameworks and diverse, large-scale datasets are essential to mitigate these risks, yet existing academic efforts remain limited in scope: current datasets lack the diversity, scale, and realism required for social media contexts, and detection methods struggle to generalize to unseen generative models.

To bridge this gap, we introduce So-Fake-Set, a comprehensive social media-oriented dataset of over 2 million high-quality images spanning diverse generative sources, with photorealistic imagery synthesized by 35 state-of-the-art generative models. To rigorously evaluate cross-domain robustness, we establish So-Fake-OOD, a large-scale out-of-domain benchmark of 100K images featuring synthetic imagery from commercial models explicitly excluded from the training distribution, creating a realistic testbed for real-world performance.

Leveraging these resources, we present So-Fake-R1, an advanced vision-language framework that employs reinforcement learning for highly accurate forgery detection, precise localization, and explainable inference through interpretable visual rationales. Extensive experiments show that So-Fake-R1 outperforms the second-best method, with a 1.3% gain in detection accuracy and a 4.5% increase in localization IoU. By integrating a scalable dataset, a challenging OOD benchmark, and an advanced detection framework, this work establishes a new foundation for social media-centric forgery detection research.
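As a point of reference for the localization metric above: IoU (intersection over union) measures the overlap between a predicted forged region and the ground-truth region. Below is a minimal, self-contained Python sketch for axis-aligned boxes; the box representation is an illustrative assumption, not the paper's exact region format.

# Minimal IoU sketch for axis-aligned boxes given as (x1, y1, x2, y2).
# The box representation is an illustrative assumption.
def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a prediction partially overlapping the ground truth.
print(box_iou((10, 10, 60, 60), (20, 20, 70, 70)))  # ~0.47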

Comparison with Existing Image Deepfake Datasets


Comparison with existing image forgery datasets. “–” in #Methods indicates the number of generative methods was not specified; “#” denotes an exact count. Column abbreviations: MultiCls = Multiclasses, Expl. = Explanation, OOD = Out-of-distribution benchmark.

Generative Methods Used in So-Fake-Set and So-Fake-OOD


Details of generative methods used in constructing So-Fake-Set and So-Fake-OOD. Column abbreviations: Set = So-Fake-Set, OOD = So-Fake-OOD, F = fully synthetic images, T = tampered images. Real data source abbreviations: F30k = Flickr30k, OI = OpenImages, OF = OpenForensics.

Overview of Dataset Construction


Overview of dataset construction. (a) Data sources and visual examples for So-Fake-Set and So-Fake-OOD. (b) Generation process of full synthetic images (left) and tampered images (right).
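To make the tampered-image branch (right) concrete, here is a rough sketch of one way such a pipeline can be realized with an off-the-shelf inpainting model: mask an object region in a real photo and let the model resynthesize it. The checkpoint, file names, and prompt below are placeholders; So-Fake-Set and So-Fake-OOD draw on many different generative methods.

# Rough sketch of a tampering pipeline: mask an object region in a real
# image and regenerate it with a diffusion inpainting model. Checkpoint,
# file names, and prompt are placeholders, not the paper's exact setup.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("real_photo.png").convert("RGB").resize((512, 512))
mask = Image.open("object_mask.png").convert("L").resize((512, 512))  # white = region to regenerate

tampered = pipe(
    prompt="a person standing on a sidewalk",  # placeholder edit prompt
    image=image,
    mask_image=mask,
).images[0]
tampered.save("tampered_photo.png")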

Visual Cases

Figure: Visual Examples of Full Synthetic Images from So-Fake-OOD

This figure showcases diverse examples from So-Fake-OOD, a benchmark specifically curated to evaluate out-of-distribution generalization. The images are collected from Reddit subreddits, capturing complex, authentic content across a wide range of topics. Their diversity and realism pose significant challenges for forgery detection models, making So-Fake-OOD well suited to testing robustness beyond curated datasets.
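The practical consequence is a gap between in-domain and out-of-domain accuracy for detectors trained on curated data. A minimal sketch of measuring that gap is shown below; the folder layout and label convention are hypothetical, not part of the released benchmark.

# Hypothetical sketch: measuring the in-domain vs. out-of-domain accuracy
# gap. The "real"/"fake" subdirectory layout is an assumption.
from pathlib import Path

def iter_samples(root: Path):
    """Yield (image_path, label) pairs; label 1 = fake, 0 = real."""
    for name, label in (("real", 0), ("fake", 1)):
        yield from ((p, label) for p in sorted((root / name).glob("*")))

def accuracy(predict, root: Path) -> float:
    """predict: a callable mapping an image path to a 0/1 decision."""
    pairs = list(iter_samples(root))
    return sum(predict(p) == y for p, y in pairs) / len(pairs)

# Detectors that score well in-domain can drop sharply out of domain:
# acc_id  = accuracy(my_detector, Path("so-fake-set/test"))  # hypothetical paths
# acc_ood = accuracy(my_detector, Path("so-fake-ood"))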

Figure: Visual Examples of Tampered Images from So-Fake-OOD

Method: So-Fake-R1


(a) Overview of the So-Fake-R1 training process; (b) the detailed So-Fake-R1 GRPO training process. The example shows a tampered image in which a boy has been manipulated.
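For readers unfamiliar with GRPO (Group Relative Policy Optimization), its central step is simple: sample a group of responses per input, score each with a task reward, and normalize the rewards within the group to obtain advantages. Below is a minimal sketch of that step with an illustrative composite reward; the actual reward terms and weights used by So-Fake-R1 are not reproduced here.

# Minimal sketch of GRPO's group-relative advantage computation. The reward
# terms and weights below are illustrative assumptions combining detection,
# localization, and output-format signals.
import statistics

def toy_reward(pred_label, gt_label, loc_iou, well_formatted):
    r = 1.0 if pred_label == gt_label else 0.0   # detection term
    r += loc_iou                                  # localization term in [0, 1]
    r += 0.1 if well_formatted else 0.0          # output-format term
    return r

def group_advantages(rewards):
    """Normalize each reward against its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0      # guard against zero variance
    return [(r - mean) / std for r in rewards]

# G = 4 sampled analyses of one tampered image:
rewards = [toy_reward("fake", "fake", 0.8, True),
           toy_reward("fake", "fake", 0.3, True),
           toy_reward("real", "fake", 0.0, False),
           toy_reward("fake", "fake", 0.6, False)]
print(group_advantages(rewards))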

Video Presentation

Experiments

Visual Examples of So-Fake-R1

BibTeX

@misc{huang2025sofakebenchmarkingexplainingsocial,
      title={So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection},
      author={Zhenglin Huang and Tianxiao Li and Xiangtai Li and Haiquan Wen and Yiwei He and Jiangning Zhang and Hao Fei and Xi Yang and Xiaowei Huang and Bei Peng and Guangliang Cheng},
      year={2025},
      eprint={2505.18660},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2505.18660},
}