---
datasets:
  - csebuetnlp/xlsum
language:
  - am
  - ar
  - az
  - bn
  - my
  - zh
  - en
  - fr
  - gu
  - ha
  - hi
  - ig
  - id
  - ja
  - rn
  - ko
  - ky
  - mr
  - ne
  - om
  - ps
  - fa
  - pcm
  - pt
  - pa
  - ru
  - gd
  - sr
  - si
  - so
  - es
  - sw
  - ta
  - te
  - th
  - ti
  - tr
  - uk
  - ur
  - uz
  - vi
  - cy
  - yo
multilinguality:
  - multilingual
pipeline_tag: summarization
---

# Model Card for deltalm-base-xlsum

This model is a fine-tuned version of DeltaLM-base on the XLSum dataset, aimed at abstractive multilingual summarization.

It achieves the following results on the evaluation set:

- ROUGE-1: 18.2
- ROUGE-2: 7.6
- ROUGE-L: 14.9
- ROUGE-Lsum: 14.7
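
A minimal inference sketch is shown below. It assumes the checkpoint can be loaded with the standard `transformers` summarization pipeline and that the repository ID is `hhhhzy/deltalm-base-xlsum`; both are assumptions and may need adjusting for this checkpoint.

```python
# Minimal sketch: run the model through the transformers summarization pipeline.
# The repo ID below is assumed; replace it with the actual model repository.
from transformers import pipeline

summarizer = pipeline("summarization", model="hhhhzy/deltalm-base-xlsum")

article = "..."  # a full news article in any of the supported languages
summary = summarizer(article, max_length=84, truncation=True)
print(summary[0]["summary_text"])
```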

## Dataset description

XL-Sum is a comprehensive and diverse dataset comprising 1.35 million professionally annotated article-summary pairs from the BBC, extracted using a set of carefully designed heuristics. It covers 45 languages ranging from low- to high-resource, for many of which no public dataset is currently available. XL-Sum is highly abstractive, concise, and of high quality, as indicated by human and intrinsic evaluation.
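
A short sketch of loading one language split of XL-Sum with the `datasets` library; each language in the list below is a separate configuration, and the field names are taken from the dataset card.

```python
# Sketch: load one XL-Sum language configuration with the datasets library.
from datasets import load_dataset

xlsum_en = load_dataset("csebuetnlp/xlsum", "english")

print(xlsum_en)                 # train/validation/test splits
example = xlsum_en["train"][0]
print(example["title"])
print(example["summary"])       # reference summary
print(example["text"][:200])    # article body
```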

## Languages

- amharic
- arabic
- azerbaijani
- bengali
- burmese
- chinese_simplified
- chinese_traditional
- english
- french
- gujarati
- hausa
- hindi
- igbo
- indonesian
- japanese
- kirundi
- korean
- kyrgyz
- marathi
- nepali
- oromo
- pashto
- persian
- pidgin
- portuguese
- punjabi
- russian
- scottish_gaelic
- serbian_cyrillic
- serbian_latin
- sinhala
- somali
- spanish
- swahili
- tamil
- telugu
- thai
- tigrinya
- turkish
- ukrainian
- urdu
- uzbek
- vietnamese
- welsh
- yoruba

## Training hyperparameters

The model was trained on a p4d.24xlarge instance on AWS SageMaker with the following configuration:

- model: DeltaLM-base
- batch size: 8
- learning rate: 1e-5
- number of epochs: 3
- warmup steps: 500
- weight decay: 0.01
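
A sketch of how the hyperparameters above might be expressed as `transformers` `Seq2SeqTrainingArguments`; the output directory name and the per-device interpretation of the batch size are assumptions, not details confirmed by this card.

```python
# Sketch only: the hyperparameters listed above as Seq2SeqTrainingArguments.
# Output directory and per-device batch-size interpretation are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="deltalm-base-xlsum",   # assumed name
    per_device_train_batch_size=8,     # batch size: 8
    learning_rate=1e-5,                # learning rate: 1e-5
    num_train_epochs=3,                # number of epochs: 3
    warmup_steps=500,                  # warmup steps: 500
    weight_decay=0.01,                 # weight decay: 0.01
    predict_with_generate=True,        # generate summaries during evaluation
)
```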