---
license: apache-2.0
datasets:
- sitloboi2012/rvl_cdip_small_dataset
- sitloboi2012/rvl_cdip_large_dataset
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: image-to-text
tags:
- DocumentAI
- ImageClassification
- Donut
---
# Model Card for Model ID


This model is intended as a baseline for using Donut on RVL-CDIP. It was trained on a small-scale subset of RVL-CDIP (specifically, 100 images from the dataset).

## Model Details

The model uses Donut, loaded through the `VisionEncoderDecoder` classes in Transformers, as the backbone for an end-to-end document classification task.
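
As a rough illustration of how a Donut classifier like this one might be loaded and queried, here is a minimal sketch. Several names are assumptions rather than facts from this card: `MODEL_ID` is a hypothetical placeholder for the actual repository id, `TASK_PROMPT` assumes the checkpoint follows the `<s_rvlcdip>` prompt convention of Naver's RVL-CDIP Donut model, and `extract_class` is an illustrative helper that assumes the label is emitted inside `<s_class>...</s_class>`.

```python
import re
from typing import Optional

# Hypothetical placeholder: this card does not state the real repository id.
MODEL_ID = "your-username/your-donut-rvl-cdip-model"
# Assumed task prompt, borrowed from Naver's RVL-CDIP Donut checkpoint.
TASK_PROMPT = "<s_rvlcdip>"


def extract_class(sequence: str) -> Optional[str]:
    """Return the label from a Donut-style output sequence, or None.

    Accepts both '<s_class><invoice/></s_class>' (label as a special token)
    and '<s_class>invoice</s_class>' (label as plain text).
    """
    match = re.search(r"<s_class>\s*<?([^<>/]+?)(?:/>)?\s*</s_class>", sequence)
    return match.group(1) if match else None


def classify(image_path: str) -> Optional[str]:
    """Run one document image through the model and return the predicted label."""
    # Heavy imports stay local so extract_class can be used without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_ID)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    model.eval()

    # Encode the page image and the task prompt that starts decoding.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        generated = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )

    # Keep special tokens when decoding, since the label may be one of them.
    sequence = processor.batch_decode(generated)[0]
    return extract_class(sequence)
```

Calling `classify("page.png")` with a real repository id in `MODEL_ID` should return a label string if the assumptions above hold. Because the model was trained on only 100 images, predictions are best treated as a sanity check of the pipeline, not as a measure of accuracy.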

### Downstream Use


This model can be used as a starting point for fine-tuning on document classification tasks in other domains, such as food documents or financial documents.
For fine-tuning on other downstream tasks, please refer to the original model from Naver.