Image Classification
Transformers.js
ONNX
timm
Transformers
vit
detection
deepfake
forensics
deepfake_detection
community
opensight
How to use onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX with Transformers.js:

```js
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('image-classification', 'onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX');
```

How to use onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX with timm:

```python
import timm

model = timm.create_model("hf_hub:onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX", pretrained=True)
```

How to use onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-classification", model="onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX")
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
```

```python
# Load model directly
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX")
model = AutoModelForImageClassification.from_pretrained("onnx-community/CommunityForensics-DeepfakeDet-ViT-ONNX")
```
Upload folder using huggingface_hub
- README.md +104 -0
- config.json +28 -0
- onnx/model.onnx +3 -0
- onnx/model_bnb4.onnx +3 -0
- onnx/model_fp16.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_quantized.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
- preprocessor_config.json +25 -0
README.md
ADDED
@@ -0,0 +1,104 @@
---
base_model:
- buildborderless/CommunityForensics-DeepfakeDet-ViT
library_name: transformers.js
license: mit
pipeline_tag: image-classification
tags:
- image-classification
- timm
- transformers
- detection
- deepfake
- forensics
- deepfake_detection
- community
- opensight
---

# CommunityForensics-DeepfakeDet-ViT (ONNX)

This is an ONNX version of [buildborderless/CommunityForensics-DeepfakeDet-ViT](https://huggingface.co/buildborderless/CommunityForensics-DeepfakeDet-ViT). It was automatically converted and uploaded using [this Hugging Face Space](https://huggingface.co/spaces/onnx-community/convert-to-onnx).

## Usage with Transformers.js

See the pipeline documentation for `image-classification`: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageClassificationPipeline

---

# Trained on 2.7M samples across 4,803 generators (see Training Data)

Model presented in [Community Forensics: Using Thousands of Generators to Train Fake Image Detectors](https://huggingface.co/papers/2411.04125).

**Uploaded for community validation as part of OpenSight**, an upcoming open-source framework for adaptive deepfake detection.

**Project OpenSight HF Spaces are coming soon, with an eval playground and eventually a leaderboard. Preview:**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/uhwHrkI1QFHBLaOsCoDFZ.png)

## Model Details
### Model Description
Vision Transformer (ViT) model trained on the largest dataset to date for detecting AI-generated images in forensic applications.

- **Developed by:** Jeongsoo Park and Andrew Owens, University of Michigan
- **Model type:** Vision Transformer (ViT-Small)
- **License:** MIT (compatible with CreativeML OpenRAIL-M referenced in [2411.04125v1.pdf])
- **Finetuned from:** timm/vit_small_patch16_384.augreg_in21k_ft_in1k
- **Adapted for HF** inference compatibility by AI Without Borders.

**An HF Space showcasing various ways to run ultra-fast inference will be open-sourced shortly. Follow us for updates, as we will be releasing a slew of projects in the coming weeks.**

### Links
- **Repository:** [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
- **Paper:** [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)
- **Project Page:** https://jespark.net/projects/2024/community_forensics

## Training Details
### Training Data
- 2.7 million images from 15+ generators, 4600+ models
- Over 1.15 TB of images

### Training Hyperparameters
- **Framework:** PyTorch 2.0
- **Precision:** bf16 mixed
- **Optimizer:** AdamW (lr=5e-5)
- **Epochs:** 10
- **Batch Size:** 32

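As a rough sense of scale, the training figures above imply the number of optimizer steps computed below. This is a back-of-the-envelope sketch only: it assumes that every one of the 2.7 million images is seen once per epoch, that 32 is the global batch size, and that no gradient accumulation is used. The model card states none of these.

```python
import math

# Values taken from the model card.
NUM_IMAGES = 2_700_000  # "2.7 million images"
BATCH_SIZE = 32
EPOCHS = 10


def optimizer_steps(num_images: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total steps), counting a final partial batch."""
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


per_epoch, total = optimizer_steps(NUM_IMAGES, BATCH_SIZE, EPOCHS)
print(per_epoch, total)  # 84375 843750
```
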
## Evaluation
### Unverified Testing Results
- These results are unverified only because we currently lack the resources to evaluate on a dataset over 1.4 TB in size.

| Metric   | Value |
|----------|-------|
| Accuracy | 97.2% |
| F1 Score | 0.968 |
| AUC-ROC  | 0.992 |
| FP Rate  | 2.1%  |

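For readers who want to compute the same metrics on their own evaluation set, the sketch below shows how accuracy, F1 and false-positive rate follow from a binary confusion matrix. It treats "fake" as the positive class, which is an assumption, since the model card does not say which class is positive. The counts in the usage line are made up for illustration and are not this model's results. AUC-ROC is left out because it needs per-image scores rather than counts.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, F1 and false-positive rate from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one predicted positive,
    one actual positive and one actual negative).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
        "fp_rate": fp / (fp + tn),
    }


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=90, fp=5, tn=95, fn=10))
```
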
![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/jTW8XPLJmgW6DaLY-4sVN.png)

## Re-sampled and refined dataset

- **Coming soon™**

## Citation
**BibTeX:**
```bibtex
@misc{park2024communityforensics,
    title={Community Forensics: Using Thousands of Generators to Train Fake Image Detectors},
    author={Jeongsoo Park and Andrew Owens},
    year={2024},
    eprint={2411.04125},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2411.04125},
}
```
config.json
ADDED
@@ -0,0 +1,28 @@
{
  "architectures": [
    "ViTForImageClassification"
  ],
  "attention_probs_dropout_prob": 0.0,
  "dtype": "float32",
  "encoder_stride": 16,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 384,
  "image_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-06,
  "mlp_ratio": 4,
  "model_type": "vit",
  "num_attention_heads": 12,
  "num_channels": 3,
  "num_classes": 1,
  "num_heads": 6,
  "num_hidden_layers": 12,
  "num_layers": 12,
  "patch_size": 16,
  "pooler_act": "tanh",
  "pooler_output_size": 384,
  "qkv_bias": true,
  "transformers_version": "4.57.6"
}
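To make the geometry implied by this config concrete, the sketch below derives the patch grid and sequence length from `image_size` and `patch_size`, and the per-head width from `hidden_size`. Note that the file carries two sets of keys: the Transformers-style `num_attention_heads` (12) and `intermediate_size` (3072), and the timm-style `num_heads` (6) and `mlp_ratio` (4, which gives 4 × 384 = 1536). Which set matches the exported weights is not stated here. The ViT-Small base model named in the README suggests the timm-style values, but that is an assumption, as is the single class token taken from the standard ViT design.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 384,
    "image_size": 384,
    "patch_size": 16,
    "num_attention_heads": 12,  # Transformers-style key
    "num_heads": 6,             # timm-style key
    "intermediate_size": 3072,  # Transformers-style key
    "mlp_ratio": 4,             # timm-style key
}


def vit_geometry(cfg: dict) -> dict:
    """Derive patch-grid and width figures implied by a ViT config."""
    grid = cfg["image_size"] // cfg["patch_size"]  # patches per side
    num_patches = grid * grid
    return {
        "grid": grid,
        "num_patches": num_patches,
        "seq_len_with_cls": num_patches + 1,  # assumes one class token
        "head_dim_hf": cfg["hidden_size"] // cfg["num_attention_heads"],
        "head_dim_timm": cfg["hidden_size"] // cfg["num_heads"],
        "mlp_dim_from_ratio": cfg["hidden_size"] * cfg["mlp_ratio"],
    }


geometry = vit_geometry(config)
print(geometry)
# Compare the MLP width implied by mlp_ratio with intermediate_size.
print(geometry["mlp_dim_from_ratio"] == config["intermediate_size"])
```

The final comparison is a quick way to surface the mismatch between the two key sets before relying on either one.
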
onnx/model.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:55602c73f457858f0b48c19bfa03bc6bd218f37027cb1f6b569ee24e2f954d03
size 143845409
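Each `.onnx` entry in this commit is a Git LFS pointer rather than the model itself: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows. The function names and the local file path are illustrative assumptions, not part of any Hugging Face or Git LFS tooling, and the check only handles `sha256` digests.

```python
import hashlib
import os

# Pointer text copied from onnx/model.onnx above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:55602c73f457858f0b48c19bfa03bc6bd218f37027cb1f6b569ee24e2f954d03
size 143845409
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algorithm"], pointer["size"])
# matches_pointer("model.onnx", pointer)  # hypothetical local path to the downloaded file
```
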
onnx/model_bnb4.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:263c46052167a15b981848465b8adb9f28dbd1f9ad8ecf8157cb05d876f7091b
size 24416892
onnx/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2458c8472f3f93ecbda4acbe137382b559845f76ecf142f0fcbc03a07c7de739
size 72106631
onnx/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a504b8ea9372e8c9be9e8e3aa0a7f0f2eff5ba3df067c3d5ed2fcdfe15eaba2c
size 36969938
onnx/model_q4.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:263c46052167a15b981848465b8adb9f28dbd1f9ad8ecf8157cb05d876f7091b
size 24416892
onnx/model_q4f16.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3ab58b5e202b4ad737dc7be5aae05f57deddb24c5c722d2d78e38adc913c33eb
size 21234115
onnx/model_quantized.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a504b8ea9372e8c9be9e8e3aa0a7f0f2eff5ba3df067c3d5ed2fcdfe15eaba2c
size 36969938
onnx/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:86097e4bec2e4ff1d8d5ae575c3ffa0ae55a080cbba2181879fa1a068659e2b2
size 36969975
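Comparing the eight pointers above shows two things: how much each variant shrinks the download relative to the fp32 `model.onnx`, and that some file names point at identical content (same `oid`). The sketch below tabulates both from the sizes and digests listed in this commit. Only the first 8 hex characters of each digest are kept for brevity, which is enough to tell these particular files apart but would not be a safe shortcut in general.

```python
from collections import defaultdict

# (file name, first 8 hex chars of the sha256 oid, size in bytes), from the pointers above.
FILES = [
    ("model.onnx",           "55602c73", 143_845_409),
    ("model_bnb4.onnx",      "263c4605",  24_416_892),
    ("model_fp16.onnx",      "2458c847",  72_106_631),
    ("model_int8.onnx",      "a504b8ea",  36_969_938),
    ("model_q4.onnx",        "263c4605",  24_416_892),
    ("model_q4f16.onnx",     "3ab58b5e",  21_234_115),
    ("model_quantized.onnx", "a504b8ea",  36_969_938),
    ("model_uint8.onnx",     "86097e4b",  36_969_975),
]

baseline = FILES[0][2]  # size of the fp32 model

# Group file names by digest to find byte-identical uploads.
by_digest = defaultdict(list)
for name, digest, _ in FILES:
    by_digest[digest].append(name)
duplicates = [names for names in by_digest.values() if len(names) > 1]

for name, _, size in FILES:
    print(f"{name:22s} {size / 1e6:7.1f} MB  {size / baseline:6.1%} of fp32")
print("identical files:", duplicates)
```
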
preprocessor_config.json
ADDED
@@ -0,0 +1,25 @@
{
  "crop_pct": 0.875,
  "crop_size": 384,
  "do_convert_rgb": null,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "ViTImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 440,
    "width": 440
  }
}
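The numeric part of this preprocessing (rescale, then normalize) can be written per channel as `(value * rescale_factor - mean) / std`. The sketch below applies it to a single RGB pixel using the constants from the file. It deliberately leaves out the geometric steps: the reading that images are resized to 440 × 440 and then center-cropped to 384 × 384 (384 / 0.875 ≈ 439) is an assumption based on the `size`, `crop_size` and `crop_pct` fields rather than something the config states outright.

```python
# Constants copied from preprocessor_config.json above.
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255
IMAGE_MEAN = (0.48145466, 0.4578275, 0.40821073)
IMAGE_STD = (0.26862954, 0.26130258, 0.27577711)


def normalize_pixel(rgb: tuple[int, int, int]) -> tuple[float, ...]:
    """Rescale 8-bit RGB values to [0, 1], then standardize each channel."""
    return tuple(
        (value * RESCALE_FACTOR - mean) / std
        for value, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


print(normalize_pixel((0, 0, 0)))        # darkest input: every channel is negative
print(normalize_pixel((255, 255, 255)))  # brightest input: every channel is positive
```
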