---
language: en
license: mit
library_name: transformers
tags:
- summarization
- bart
datasets: ccdv/arxiv-summarization
model-index:
- name: BARTxiv
  results:
  - task:
      type: summarization
    dataset:
      name: arxiv-summarization
      type: ccdv/arxiv-summarization
      split: validation
    metrics:
    - type: rouge1
      value: 41.70204016592095
    - type: rouge2
      value: 15.134827404979639
---


# BARTxiv

See the model implementation [here](https://interrsect.web.app).

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the [arxiv-summarization](https://huggingface.co/datasets/ccdv/arxiv-summarization) dataset.
It achieves the following results on the validation set:
- Loss: 0.86
- Rouge1: 41.70
- Rouge2: 15.13
- Rougel: 22.85
- Rougelsum: 37.77
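
A minimal sketch of how a checkpoint like this could be used for summarization with the `transformers` pipeline API. The card does not state this model's Hub repository ID, so `MODEL_ID` below is a placeholder set to the base checkpoint; the `summarize` helper is hypothetical and not part of this repository.

```python
# Assumption: the card does not give this model's Hub repository ID, so the
# base checkpoint named above stands in as a placeholder. Replace it with the
# BARTxiv repository ID to run the fine-tuned weights.
MODEL_ID = "facebook/bart-large-cnn"


def summarize(text, model_id=MODEL_ID, **generate_kwargs):
    """Return a summary of `text` using a summarization pipeline."""
    # Imported lazily so that defining the function does not load transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips inputs that exceed the model's maximum input length.
    outputs = summarizer(text, truncation=True, **generate_kwargs)
    return outputs[0]["summary_text"]
```

Calling `summarize(paper_text, max_length=200)` would forward `max_length` to generation; suitable generation settings for arXiv papers are not documented in this card.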

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-6
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adafactor
- num_epochs: 9
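
The values above could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: the mapping of `train_batch_size` to a per-device value assumes a single device, `optim="adafactor"` assumes the Trainer's built-in Adafactor option was used, and `output_dir` is a hypothetical path.

```python
# Hyperparameters listed in this card, keyed by the corresponding
# TrainingArguments field names (the mapping itself is an assumption).
HYPERPARAMETERS = {
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 1,  # assumes training on a single device
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "optim": "adafactor",  # assumes the Trainer's built-in Adafactor option
    "num_train_epochs": 9,
}


def build_training_args(output_dir="bartxiv-out"):
    """Build Seq2SeqTrainingArguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Settings the card does not list (for example warmup, weight decay, or gradient accumulation) are left at library defaults here, which may not match the original run.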

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 1.24          | 1.0   | 1073 | 1.24            | 38.32   | 12.80   | 20.55   | 34.50     |
| 1.04          | 2.0   | 2146 | 1.04            | 39.65   | 13.74   | 21.28   | 35.83     |
| 0.979         | 3.0   | 3219 | 0.98            | 40.19   | 14.30   | 21.87   | 36.38     |
| 0.970         | 4.0   | 4292 | 0.97            | 40.87   | 14.44   | 22.14   | 36.89     |
| 0.918         | 5.0   | 5365 | 0.92            | 41.17   | 14.94   | 22.54   | 37.40     |
| 0.901         | 6.0   | 6438 | 0.90            | 41.02   | 14.65   | 22.46   | 37.05     |
| 0.889         | 7.0   | 7511 | 0.89            | 41.32   | 15.09   | 22.64   | 37.42     |
| 0.900         | 8.0   | 8584 | 0.90            | 41.23   | 15.02   | 22.67   | 37.28     |
| 0.869         | 9.0   | 9657 | 0.87            | 41.70   | 15.13   | 22.85   | 37.77     |
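
As a quick check on the table, the sketch below re-enters the rows by hand and confirms two things that can be read off it: every epoch takes 1073 steps, and the final epoch has both the highest Rouge1 and the lowest validation loss. The variable names are illustrative only.

```python
# Rows copied from the table above:
# (epoch, step, validation_loss, rouge1, rouge2, rougeL, rougeLsum)
ROWS = [
    (1, 1073, 1.24, 38.32, 12.80, 20.55, 34.50),
    (2, 2146, 1.04, 39.65, 13.74, 21.28, 35.83),
    (3, 3219, 0.98, 40.19, 14.30, 21.87, 36.38),
    (4, 4292, 0.97, 40.87, 14.44, 22.14, 36.89),
    (5, 5365, 0.92, 41.17, 14.94, 22.54, 37.40),
    (6, 6438, 0.90, 41.02, 14.65, 22.46, 37.05),
    (7, 7511, 0.89, 41.32, 15.09, 22.64, 37.42),
    (8, 8584, 0.90, 41.23, 15.02, 22.67, 37.28),
    (9, 9657, 0.87, 41.70, 15.13, 22.85, 37.77),
]

# Steps per epoch, derived from the cumulative step count in each row.
steps_per_epoch = {step // epoch for epoch, step, *_ in ROWS}

# Epoch with the highest Rouge1 and epoch with the lowest validation loss.
best_rouge1_epoch = max(ROWS, key=lambda row: row[3])[0]
lowest_loss_epoch = min(ROWS, key=lambda row: row[2])[0]

print("steps per epoch:", steps_per_epoch)
print("best Rouge1 at epoch:", best_rouge1_epoch)
print("lowest validation loss at epoch:", lowest_loss_epoch)
```

With `train_batch_size: 1`, and assuming a single device and no gradient accumulation, 1073 steps per epoch would correspond to 1073 training examples per epoch. The card does not confirm which portion of the dataset was used.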

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1