brian-lim commited on
Commit
be423b2
Β·
1 Parent(s): 4d6a9e2

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +29 -0
README.md CHANGED
@@ -1,3 +1,32 @@
1
  ---
2
  license: apache-2.0
 
 
 
 
3
  ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - brian-lim/smile_style_orca
5
+ language:
6
+ - ko
7
  ---
8
+ # Korean Style Transfer
9
+
10
+ This model is a fine-tuned version of [Synatra-7B-v0.3-dpo](https://huggingface.co/maywell/Synatra-7B-v0.3-dpo) using a Korean style dataset provided by Smilegate AI (https://github.com/smilegate-ai/korean_smile_style_dataset/tree/main).
11
+ Since the original dataset is tabular and not fit for training the LLM, I have preprocessed it into instruction-input-output format, which can be found (here)[https://huggingface.co/datasets/brian-lim/smile_style_orca].
12
+ The dataset is then fed into the ChatML template. Feel free to use my version of the dataset as needed.
13
+
14
+ ν•΄λ‹Ή λͺ¨λΈμ€ [Synatra-7B-v0.3-dpo](https://huggingface.co/maywell/Synatra-7B-v0.3-dpo) λͺ¨λΈμ„ 슀마일게이트 AIμ—μ„œ μ œκ³΅ν•˜λŠ” Smile style λ°μ΄ν„°μ…‹μœΌλ‘œ νŒŒμΈνŠœλ‹ ν–ˆμŠ΅λ‹ˆλ‹€.
15
+ κΈ°μ‘΄ 데이터셋은 ν…Œμ΄λΈ” ν˜•νƒœλ‘œ λ˜μ–΄μžˆμ–΄ ν•΄λ‹Ή 데이터λ₯Ό instruction-input-output ν˜•νƒœλ‘œ λ§Œλ“€μ—ˆκ³ , (μ—¬κΈ°)[https://huggingface.co/datasets/brian-lim/smile_style_orca]μ—μ„œ 확인 κ°€λŠ₯ν•©λ‹ˆλ‹€.
16
+ 데이터셋을 뢈러온 λ’€ ChatML ν˜•μ‹μ— 맞좰 ν›ˆλ ¨ 데이터 ꡬ좕을 ν•œ λ’€ μ§„ν–‰ν–ˆμŠ΅λ‹ˆλ‹€. ν•„μš”ν•˜μ‹œλ‹€λ©΄ 자유둭게 μ‚¬μš©ν•˜μ‹œκΈ° λ°”λžλ‹ˆλ‹€.
17
+
18
+ # Intended use & limitations
19
+
20
+ To be added
21
+
22
+ μΆ”κ°€ μ˜ˆμ •
23
+
24
+ # How to use
25
+
26
+ To be added
27
+
28
+ μΆ”κ°€μ˜ˆμ •
29
+
30
+ ---
31
+ license: apache-2.0
32
+ ---