Official repository for the paper "Simplicity Prevails: The Emergence of Generalizable AIGI Detection in Visual Foundation Models" (https://arxiv.org/pdf/2602.01738)

If you have any questions, please feel free to open a discussion in the Community tab. For direct inquiries, you can also reach out to us via email at [email protected].

VFM Baselines Release

This directory contains the 7 vision foundation model baselines used in the paper:

  • MetaCLIP-Linear
  • MetaCLIP2-Linear
  • SigLIP-Linear
  • SigLIP2-Linear
  • PE-CLIP-Linear
  • DINOv2-Linear
  • DINOv3-Linear

Contents

  • models.py: unified model-loading code for all 7 baselines
  • test_vfm_baselines.py: unified evaluation script
  • weights/: released checkpoints
  • core/vision_encoder/: vendored PE vision encoder code required by PE-CLIP-Linear

Model Names

The unified loader and test script accept these names:

  • metacliplin
  • metaclip2lin
  • sigliplin
  • siglip2lin
  • pelin
  • dinov2lin
  • dinov3lin

The full paper names, such as MetaCLIP-Linear and DINOv3-Linear, are also accepted.
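Since both naming conventions are accepted, the loader presumably normalizes them to a single canonical form. The sketch below illustrates one way that mapping could work; the alias table and the function name are illustrative assumptions, not the actual API of models.py.

```python
# Hypothetical sketch of model-name normalization; the exact logic in
# models.py may differ. Canonical short names are taken from the list above.

CANONICAL = {
    "metacliplin", "metaclip2lin", "sigliplin", "siglip2lin",
    "pelin", "dinov2lin", "dinov3lin",
}

# Paper-style names mapped to short names (assumed correspondence).
ALIASES = {
    "metaclip-linear": "metacliplin",
    "metaclip2-linear": "metaclip2lin",
    "siglip-linear": "sigliplin",
    "siglip2-linear": "siglip2lin",
    "pe-clip-linear": "pelin",
    "dinov2-linear": "dinov2lin",
    "dinov3-linear": "dinov3lin",
}

def normalize_model_name(name: str) -> str:
    """Map either naming convention to a canonical short name."""
    key = name.strip().lower()
    if key in CANONICAL:
        return key
    if key in ALIASES:
        return ALIASES[key]
    raise ValueError(f"unknown model name: {name!r}")
```

With this scheme, `normalize_model_name("DINOv3-Linear")` and `normalize_model_name("dinov3lin")` resolve to the same key.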

Usage

Evaluate a single model:

python test_vfm_baselines.py \
  --model sigliplin \
  --real-dir /path/to/0_real \
  --fake-dir /path/to/1_fake \
  --max-samples 100

Evaluate all 7 models:

python test_vfm_baselines.py \
  --model all \
  --real-dir /path/to/0_real \
  --fake-dir /path/to/1_fake \
  --max-samples 100

Optional arguments:

  • --checkpoint: override the default checkpoint for single-model evaluation
  • --batch-size: batch size for evaluation
  • --num-workers: dataloader workers
  • --device: explicit device such as cuda:0 or cpu
  • --save-json: save results to a JSON file
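Real-vs-fake evaluation is a binary classification task, so the script presumably reports accuracy-style metrics over the scores it produces. The standalone sketch below (standard library only) shows two such metrics; the function names are illustrative and do not reflect the script's actual internals, which use scikit-learn.

```python
# Illustrative binary-classification metrics for real (0) vs fake (1)
# scores. Not the release code's implementation; a minimal sketch.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auroc(labels, scores):
    """AUROC via pairwise comparison of positive vs. negative scores.
    O(n_pos * n_neg), fine for a sketch but not for large runs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, labels [0, 0, 1, 1] with scores [0.1, 0.4, 0.35, 0.8] give 0.75 on both metrics.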

Dependencies

The release code expects these Python packages:

  • torch
  • torchvision
  • transformers
  • scikit-learn
  • Pillow
  • timm
  • einops
  • ftfy
  • regex
  • huggingface_hub

Notes

  • The CLIP-family and DINO-family baselines instantiate their backbones from Hugging Face model configs and then load the released checkpoints.
  • PE-CLIP-Linear uses the vendored core/vision_encoder code in this directory.
  • The checkpoints in weights/ are arranged locally for packaging convenience. For public release, they can be uploaded under the same filenames.