---
license: mit
---
# AI Detection Model

## Model Architecture and Training

Three separate models were initially trained:
1. Midjourney vs. Real Images
2. Stable Diffusion vs. Real Images
3. Stable Diffusion Fine-tunings vs. Real Images

Data preparation process (sketched in code after this list):
- Used Google's Open Image Dataset for real images
- Described real images using BLIP (Bootstrapping Language-Image Pre-training)
- Generated Stable Diffusion images using the BLIP descriptions as prompts
- Found similar Midjourney images based on the BLIP descriptions
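
A minimal sketch of the caption-and-generate pairing step, assuming the Hugging Face `transformers` and `diffusers` libraries; the specific BLIP and Stable Diffusion checkpoints and the file paths are illustrative, not necessarily those used in the original run:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Caption a real image with BLIP (checkpoint choice is an assumption).
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

real_image = Image.open("open_images/sample.jpg").convert("RGB")  # hypothetical path
inputs = processor(real_image, return_tensors="pt").to(device)
output_ids = captioner.generate(**inputs, max_new_tokens=30)
caption = processor.decode(output_ids[0], skip_special_tokens=True)

# Generate a Stable Diffusion image from the same caption, so the real and
# AI-generated images match in content and differ only in origin.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
fake_image = pipe(caption).images[0]

real_image.save("dataset/real/sample.jpg")
fake_image.save("dataset/fake/sample.png")
```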

This approach ensured real and AI-generated images were as similar as possible, differing only in their origin.

The three models were then distilled into a single EfficientNet model, combining their learned features for more efficient detection.
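
A minimal sketch of that distillation step, assuming soft-label (KL-divergence) knowledge distillation against the averaged teacher outputs; the temperature, optimizer settings, and the `torchvision` EfficientNet variant are assumptions, not details from the original run:

```python
import torch
import torch.nn.functional as F
from torchvision.models import efficientnet_b0

# Student: a single EfficientNet with two outputs (real vs. AI-generated).
student = efficientnet_b0(num_classes=2)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
T = 2.0  # softening temperature (assumed)

def distill_step(images: torch.Tensor, teachers: list[torch.nn.Module]) -> float:
    # Average the three frozen teachers' softened probabilities as the target.
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(t(images) / T, dim=1) for t in teachers]
        ).mean(dim=0)
    student_log_probs = F.log_softmax(student(images) / T, dim=1)
    # KL divergence between student and averaged-teacher distributions,
    # scaled by T^2 as is conventional in distillation.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```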

## Data Sources

- Google's Open Image Dataset: [link](https://storage.googleapis.com/openimages/web/index.html)
- Ivan Sivkov's Midjourney Dataset: [link](https://www.kaggle.com/datasets/ivansivkovenin/midjourney-prompts-image-part8)
- TANREI(NAMA)'s Stable Diffusion Prompts Dataset: [link](https://www.kaggle.com/datasets/tanreinama/900k-diffusion-prompts-dataset)

## Performance

The model was evaluated on two held-out sets (the accuracy computation is sketched after this list):

- Validation Set: 94% accuracy
  - Held out from training data to assess generalization

- Custom Real-World Set: 84% accuracy
  - Composed of self-captured images and online-sourced images
  - Designed to be more representative of internet-based images

- Comparative Analysis:
  - Outperformed other popular AI detection models by 5 percentage points on both sets
  - Other models achieved 89% and 79% on the validation and real-world sets, respectively
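
A minimal sketch of how such accuracies could be computed, assuming a `torchvision` ImageFolder layout with one directory per class; the paths, input size, and preprocessing are illustrative assumptions:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import efficientnet_b0

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # input size is an assumption
    transforms.ToTensor(),
])
val_set = datasets.ImageFolder("data/validation", transform=transform)  # hypothetical path
loader = DataLoader(val_set, batch_size=32)

model = efficientnet_b0(num_classes=2)  # distilled detector weights would be loaded here
model.eval()

# Count correct top-1 predictions over the whole set.
correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print(f"accuracy: {correct / total:.1%}")
```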

## Key Insights

1. Strong generalization on validation data (94% accuracy)
2. Good adaptability to diverse, real-world images (84% accuracy)
3. Consistent outperformance of other popular models
4. The 10-percentage-point accuracy drop from the validation set to the real-world set indicates room for improvement
5. Comprehensive training on multiple AI generation techniques contributes to model versatility
6. Content-matched training pairs push the model to focus on subtle generation artifacts rather than content disparities

## Future Directions

- Expand the dataset with more diverse, real-world examples to bridge the performance gap
- Improve generalization to internet-sourced images
- Conduct error analysis on misclassified samples to identify patterns
- Integrate new AI image generation techniques as they emerge
- Consider fine-tuning for specific domains where detection accuracy is critical