brildev7/gemma-7b-it-summarization-ko-sft-qlora

PEFT

Safetensors

trl

sft

Generated from Trainer

brildev7 committed on Mar 3, 2024

Commit c533c9d (verified) · 1 Parent(s): b35cad3

Update README.md

README.md +45 -187

@@ -1,6 +1,6 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
@@ -12,190 +12,48 @@ tags: []
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+tags: [summarization]
 ---
 # Model Card for Model ID
 ## Model Details
 ### Model Description
+A Korean summarization model fine-tuned from the gemma-7b-it model.
+- **Fine-tuned by:** Kang Seok Ju
+### Inference Examples
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+model_id = "brildev7/gemma-7b-it-finetune-summarization-ko"
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_compute_dtype=torch.float16,
+    bnb_4bit_quant_type="nf4"
+)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    device_map={"":0},
+    quantization_config=quantization_config,
+    torch_dtype=torch.float32,
+)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+tokenizer.pad_token_id = tokenizer.eos_token_id
+tokenizer.padding_side = 'right'
+passage = "AP·AFP 통신 등 외신은 아시아 최고 부자로 꼽히는 인도의 무케시 암바니 릴라이언스 인더스트리 회장이 막내아들의 초호화 결혼식을 준비하면서 전세계 억만장자와 할리우드 스타 등 유명 인사들을 대거 초대했다고 2일(현지시간) 보도했다.  이에 따르면 그의 28세 아들인 아난트 암바니는 오는 7월 인도 서부 구자라트주 잠나가르에서 오랜 연인인 라디카 머천트와 결혼할 예정이다. 머천트는 인도 제약회사 앙코르 헬스케어의 최고경영자(CEO) 바이렌 머천트의 딸이다.  사흘간 진행될 두 사람의 결혼식엔 마크 저커버그 메타 CEO, 빌 게이츠 마이크로소프트(MS) 창업자, 순다르 피차이 구글 CEO, 도널드 트럼프 전 미국 대통령의 딸 이방카 트럼프 등 1200명의 유명 인사들이 참석할 예정이다.  또 팝스타 리한나와 마술사 데이비드 블레인 등의 공연도 열릴 예정이다. 인디아 투데이는 리한나가 이 행사 출연료로 900만 달러(약 120억 원)를 제안받았다고 보도했다.   지난 6일 서울김포비즈니스항공센터를 통해 아랍메미리트연합(UAE)으로 출국하고 있는 이재용 삼성전자 회장. 뉴시스   이번 결혼식에 참석하는 하객들은 정글을 테마로 한 의상을 입고 아난트 암바니가 운영하는 동물 구조 센터를 방문한다. ‘숲의 별’이라는 뜻의 ‘반타라’로 알려진 이곳은 면적만 여의도의 4배 규모인 12㎢에 달하며 코끼리 등 각종 멸종 위기에 있는 동물들이 서식한다. 또 매일 초호화 파티가 열리며 그때마다 새로운 드레스 코드에 맞춰 옷을 입어야 한다.  이번 결혼식을 위해 암바니는 힌두교 사원 단지를 새로 건설 중이며, 결혼식 파티에만 2500여 개의 음식이 제공될 예정이다.  암바니는 2018년과 2019년에도 각각 딸과 아들을 결혼시키면서 초호화 파티를 열어 전 세계의 이목을 집중시켰다.  2018년 12월에 열린 딸 이샤 암바니의 결혼식 축하연에는 힐러리 클린턴 전 미국 국무장관과 이재용 삼성전자 회장, 언론 재벌 루퍼트 머독의 차남 제임스 머독 등이 참석했고, 축하 공연은 팝스타 비욘세가 맡았다. 암바니 회장은 이 결혼식에만 1억 달러(약 1336억 원)를 사용한 것으로 전해졌다.  2019년 장남 아카시 암바니의 결혼식에도 토니 블레어 전 영국 총리를 비롯해 순다르 피차이와 반기문 전 유엔사무총장 등이 참석했다. 이재용 회장은 이 때 인도 전통 의상을 입고 참석한 사진이 공개돼 화제가 되기도 했다.  암바니 회장은 석유와 가스, 석유화학 분야에서 성공해 많은 돈을 모았고 2016년 릴라이언스 지오를 앞세워 인도 통신 시장에도 진출, 인도 시장을 사실상 평정하면서 아시아 최고 갑부 대열에 올라섰다.  그가 소유한 인도 뭄바이의 27층짜리 저택 ‘안탈리아’는 세계에서 가장 비싼 개인 주택으로 꼽힌다."
+text = f"문장: {passage}\n요약 :"
+device = "cuda:0"
+inputs = tokenizer(text, return_tensors="pt").to(device)
+outputs = model.generate(**inputs,
+                        max_new_tokens=512,
+                        temperature=1,
+                        use_cache=False)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
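
The inference example above prints the decoded sequence, which includes the prompt as well as the generated summary. A minimal sketch of separating the two follows. It assumes the prompt format shown in the README (`문장: {passage}\n요약 :`); the helper names `build_prompt` and `extract_summary` are hypothetical and not part of the repository.

```python
# Hypothetical helpers, not part of the model repository.
# The prompt format is copied from the README's inference example.
SUMMARY_MARKER = "요약 :"


def build_prompt(passage: str) -> str:
    """Wrap a passage in the prompt format used by the README example."""
    return f"문장: {passage}\n{SUMMARY_MARKER}"


def extract_summary(decoded: str, prompt: str) -> str:
    """Return only the generated summary from a decoded output string.

    If the decoded text starts with the exact prompt, strip that prefix.
    Otherwise fall back to the text after the last summary marker.
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # rpartition returns the whole string as the tail when the marker is absent.
    return decoded.rpartition(SUMMARY_MARKER)[2].strip()


if __name__ == "__main__":
    prompt = build_prompt("테스트 문장입니다.")
    # Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True).
    decoded = prompt + " 짧은 요약입니다."
    print(extract_summary(decoded, prompt))
```

In the README's script, `text` plays the role of `prompt`, and the argument to `print` plays the role of `decoded`. The string-based approach is chosen here only because it can be checked without loading the model; slicing the generated token ids past the prompt length before decoding is a common alternative.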