KoT5_news_summarization

Model description

<<20221021 Commit>>

To build a model specialized for news summarization for a project, I took lcw99's t5-base-korean-text-summary model and fine-tuned it further on the naver-news-summarization-ko dataset provided by daekeun-ml.

I plan to continue training it on news data that I currently have. I will keep improving the model to reach good performance. Thank you.

Runtime environment

  • Google Colab Pro
  • CPU : Intel(R) Xeon(R) CPU @ 2.20GHz
  • GPU : A100-SXM4-40GB

# Python Code
# Load the tokenizer and the fine-tuned seq2seq model from the Hugging Face Hub.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("noahkim/KoT5_news_summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("noahkim/KoT5_news_summarization")
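
The snippet above only loads the tokenizer and model. A minimal sketch of generating a summary with them might look like the following. The helper names `load` and `summarize`, the generation settings (`num_beams=4`, `max_new_tokens=128`) and the 512-token input limit are illustrative assumptions, not values stated in this card. Some T5 checkpoints expect a task prefix such as "summarize: "; this card does not say whether this one does, so `prefix` is left as an optional parameter.

```python
MODEL_ID = "noahkim/KoT5_news_summarization"


def load(model_id=MODEL_ID):
    """Load tokenizer and model from the Hugging Face Hub."""
    # Imported here so that summarize() can be reused with any
    # compatible tokenizer/model objects without importing transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def summarize(text, tokenizer, model, prefix="",
              max_input_tokens=512, max_new_tokens=128, num_beams=4):
    """Return a summary of `text`.

    All default values are assumptions chosen for illustration;
    tune them for your own data.
    """
    # Tokenize, truncating articles longer than max_input_tokens.
    inputs = tokenizer(
        prefix + text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Beam search decoding with a cap on the summary length.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    # Decode the first (and only) sequence in the batch.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `tokenizer, model = load()` followed by `summarize(article_text, tokenizer, model)`, where `article_text` is a Korean news article as a string.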

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
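
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay from 2e-05 to zero over the run. It assumes no warmup steps (none are listed above) and takes the total of 11100 steps from the training results table. The function name `linear_lr` is hypothetical. The example-count estimate additionally assumes a single device and no gradient accumulation.

```python
# Values taken from this model card.
BASE_LR = 2e-5
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 4
TOTAL_STEPS = 11100  # last step reported in the training results

# Assumption: no warmup, since the card does not list any.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
# Upper bound on training examples per epoch, assuming one device and
# no gradient accumulation (the last batch may be smaller than 8).
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

for epoch in range(NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
print("steps per epoch:", steps_per_epoch)
print("max examples per epoch:", max_examples_per_epoch)
```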

Training results

Training Loss   Epoch   Step    Validation Loss
0.4513          1.0     2775    0.4067
0.42            2.0     5550    0.3933
0.395           3.0     8325    0.3864
0.3771          4.0     11100   0.3872
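
Validation loss is lowest at epoch 3 (0.3864) and rises slightly at epoch 4 (0.3872) while training loss keeps falling. The sketch below picks the best row by validation loss from the figures above. The card does not state which checkpoint was published, so this is only an illustration of how one might choose; the names `EvalRow` and `best` are hypothetical.

```python
from collections import namedtuple

EvalRow = namedtuple("EvalRow", "train_loss epoch step val_loss")

# Rows copied from the training results in this card.
results = [
    EvalRow(0.4513, 1.0, 2775, 0.4067),
    EvalRow(0.42, 2.0, 5550, 0.3933),
    EvalRow(0.395, 3.0, 8325, 0.3864),
    EvalRow(0.3771, 4.0, 11100, 0.3872),
]

# Select the evaluation point with the lowest validation loss.
best = min(results, key=lambda row: row.val_loss)
print(f"best epoch: {best.epoch}, step: {best.step}, "
      f"validation loss: {best.val_loss}")
```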

Framework versions

  • Transformers 4.23.1
  • Pytorch 1.12.1+cu113
  • Datasets 2.6.1
  • Tokenizers 0.13.1