---
license: mit
tags:
- image-classification
- pytorch
- ViT
- transformers
- real-fake-detection
- deep-fake
- ai-detect
- ai-image-detection
metrics:
- accuracy
model-index:
- name: AI Image Detect Distilled
  results:
  - task:
      type: image-classification
      name: Image Classification
    metrics:
    - type: accuracy
      value: 0.74
pipeline_tag: image-classification
library_name: transformers
---
# AI Detection Model
## Model Architecture and Training
Three separate models were initially trained, one for each generator (a setup sketch for one such detector follows the list):
1. Midjourney vs. Real Images
2. Stable Diffusion vs. Real Images
3. Stable Diffusion Fine-tunings vs. Real Images
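As a minimal sketch, each teacher can be framed as a binary real-vs-fake image classifier in `transformers`. The backbone checkpoint and label names below are assumptions; the card does not publish the training code:

```python
# Hedged sketch: one teacher as a binary real-vs-fake ViT classifier.
# The backbone checkpoint is an assumption, not the card's published choice.
from transformers import ViTForImageClassification, ViTImageProcessor

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=2,                       # real vs. AI-generated
    id2label={0: "real", 1: "fake"},
    label2id={"real": 0, "fake": 1},
)
# Fine-tune on paired real/generated images, e.g. with transformers.Trainer.
```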
Data preparation process:
- Used Google's Open Image Dataset for real images
- Described real images using BLIP (Bootstrapping Language-Image Pre-training)
- Generated Stable Diffusion images using BLIP descriptions
- Found similar Midjourney images based on BLIP descriptions
This approach ensured real and AI-generated images were as similar as possible, differing only in their origin.
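A minimal sketch of this caption-and-regenerate pairing, assuming the standard BLIP captioning checkpoint and a Stable Diffusion pipeline from `diffusers` (neither checkpoint is named in the card):

```python
# Hedged sketch of the caption-and-regenerate pairing described above.
# Checkpoint names and file paths are illustrative assumptions.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Describe the real image with BLIP.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

real = Image.open("real.jpg").convert("RGB")
inputs = processor(real, return_tensors="pt").to(device)
caption = processor.decode(
    blip.generate(**inputs, max_new_tokens=30)[0], skip_special_tokens=True
)

# 2. Generate a content-matched fake from the caption.
sd = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
fake = sd(caption).images[0]
fake.save("fake.jpg")  # same described content as real.jpg, different origin
```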
The three models were then distilled into a small ViT with 11.8 million parameters, combining their learned features into a single, more efficient detector.
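The exact distillation recipe is not published; the sketch below shows one common multi-teacher formulation (averaged teacher logits plus a ground-truth term), with the temperature and mixing weight as assumed hyperparameters:

```python
# Hedged sketch of multi-teacher distillation; averaging the three teachers'
# soft targets is one common choice, not the card's confirmed recipe.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    # Soft targets: mean of the teachers' temperature-scaled distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        teacher_probs,
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # ground-truth supervision
    return alpha * soft + (1 - alpha) * hard
```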
## Data Sources
- Google's Open Image Dataset: [link](https://storage.googleapis.com/openimages/web/index.html)
- Ivan Sivkov's Midjourney Dataset: [link](https://www.kaggle.com/datasets/ivansivkovenin/midjourney-prompts-image-part8)
- TANREI(NAMA)'s Stable Diffusion Prompts Dataset: [link](https://www.kaggle.com/datasets/tanreinama/900k-diffusion-prompts-dataset)
## Performance
- Validation Set: 74% accuracy (an evaluation sketch follows this list)
  - Held out from training data to assess generalization
- Custom Real-World Set: 72% accuracy
  - Composed of self-captured images and online-sourced images
  - Designed to be more representative of internet-based images
- Comparative Analysis:
  - Outperformed other popular AI detection models by 5 percentage points on both sets
  - Other models achieved 69% and 67% on the validation and real-world sets respectively
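A minimal sketch of how such accuracy figures can be reproduced with the `transformers` pipeline; the repo id, file paths, and label strings are placeholders, since the card does not state them:

```python
# Hedged sketch: top-1 accuracy over a labeled image set.
# The model id and label strings are placeholders, not the published ones.
from transformers import pipeline

detector = pipeline("image-classification", model="your-org/ai-image-detect-distilled")

def accuracy(samples):
    """samples: list of (image_path, true_label) pairs, e.g. ("img.jpg", "fake")."""
    correct = sum(detector(path)[0]["label"] == label for path, label in samples)
    return correct / len(samples)

print(accuracy([("real.jpg", "real"), ("fake.jpg", "fake")]))
```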
## Key Insights
1. Strong generalization on validation data (74% accuracy)
2. Good adaptability to diverse, real-world images (72% accuracy)
3. Consistent outperformance of other popular models
4. The 2-point accuracy drop from validation to the real-world set indicates room for improvement
5. Comprehensive training on multiple AI generation techniques contributes to model versatility
6. Caption-matched training pairs focus the model on subtle differences in image generation rather than content disparities
## Future Directions
- Expand dataset with more diverse, real-world examples to bridge the performance gap
- Improve generalization to internet-sourced images
- Conduct error analysis on misclassified samples to identify patterns
- Integrate new AI image generation techniques as they emerge
- Consider fine-tuning for specific domains where detection accuracy is critical