colab notebook, no environment needed
A simple GPU Colab notebook for quick testing; no local environment configuration or installation needed.
In short, it works once switched to GPU mode; readability when translating a whole novel is still limited (awkward).
https://colab.research.google.com/drive/19rQG4ryrue-0g8KH4ATT0_o2-8tHLcIT?usp=sharing
Release Notes
This model is finetuned from mt5-base; the training method and dataset follow larryvrh/mt5-translation-ja_zh.
It was trained on a trimmed and fused CCMatrix-v1-Ja_Zh dataset at a 1e-4 learning rate for 1 epoch with no weight decay, reaching about 1.5 validation loss, which is pretty decent given this behemoth tokenizer.
Training took about 26 hours on a modified 2080 Ti 22 GB graphics card, but size-wise the model is safe to train on much smaller cards.
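Expressed as `transformers` training arguments, the stated hyperparameters would look roughly like this. This is a hedged sketch, not the author's actual training script: only the learning rate, epoch count, and weight decay come from this card; the output directory, batch size, and generation flag are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the stated hyperparameters; everything not marked "stated" is assumed.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-translation-ja_zh",  # assumed path
    learning_rate=1e-4,                       # stated: 1e-4
    num_train_epochs=1,                       # stated: 1 epoch
    weight_decay=0.0,                         # stated: no weight decay
    per_device_train_batch_size=8,            # assumed; chosen to fit a 22 GB card
    predict_with_generate=True,               # assumed, usual for translation eval
)
```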
Reason for making this model
There are some issues in the original model by larryvrh, which include:
- long-sentence repetition; it doesn't recognize line breaks
- a dirty mix of numbers and periods
- translating to or from English "sometimes"
- being a bit too big for smaller cards
These are generally all-parameter problems that I could only partially fix with an all-parameter finetune, and I prefer a base model that doesn't have these issues to begin with. So here it is...
Model Release Statement
This model was inspired by mt5-translation-ja_zh (in fact it was built by modifying it); it uses mt5-base and is somewhat smaller than the original model.
It was trained on CCMatrix-v1-Ja_Zh with a 1e-4 learning rate for 1 epoch.
Training took about 26 hours on my own 2080 Ti 22 GB card; a more advanced card would be faster.
Reason for making this model: larryvrh's original model is already quite good, but it has a few small issues:
- long sentences loop and repeat, and it doesn't recognize line breaks
- numbers and punctuation come out scrambled
- it sometimes translates into or from English, and sometimes doesn't translate at all
- it's a bit large for small machines
There are other issues as well, but the ones above involve every parameter shape: putting a LoRA on top still left the model skewed and didn't solve them, and finetuning the whole model as before is too delicate to manage. So the answer was to train a new model that fixes all of the above from the start.
A simple backend application
Not yet stably debugged; use with caution.
A more precise usage example
```python
from transformers import pipeline

model_name = "iryneko571/mt5-base-translation-ja_zh"

# single-line version:
# pipe = pipeline("translation", model=model_name, tokenizer=model_name,
#                 repetition_penalty=1.4, batch_size=1, max_length=256)
pipe = pipeline(
    "translation",
    model=model_name,
    repetition_penalty=1.4,
    batch_size=1,
    max_length=256,
)

def translate_batch(batch, language='<-ja2zh->'):  # batch is a list of strings
    # prepend the language tag to every input line
    prompts = [f'{language} {line}' for line in batch]
    translated = pipe(prompts)
    return [item['translation_text'] for item in translated]

inputs = []
print(translate_batch(inputs))
```
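Since line-break handling was one of the issues called out above, a common pattern is to translate multi-line text line by line and rejoin it afterwards. A minimal, model-free sketch of that pattern follows; `translate_document` is a hypothetical helper (not part of this model card's API), and the lambda stub stands in for `translate_batch` so the example runs without downloading the model.

```python
# Hypothetical helper: translate multi-line text line by line, preserving
# blank lines, then rejoin. `translate_fn` takes and returns a list of strings
# (in real use it would be translate_batch from the example above).
def translate_document(text, translate_fn):
    lines = text.splitlines()
    nonempty = [l for l in lines if l.strip()]          # skip blank lines
    translated = iter(translate_fn(nonempty))
    return "\n".join(next(translated) if l.strip() else l for l in lines)

# Demo with an uppercasing stub in place of the real translator:
print(translate_document("a\n\nb", lambda batch: [s.upper() for s in batch]))
# prints:
# A
#
# B
```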
Roadmap
- want some LoRAs?
- build the platform out better
How to find me
Discord Server:
https://discord.gg/JmjPmJjA
If you need any help, want a test server for the latest version, or just want to chat, come join the channel.