---
base_model: meta-llama/Llama-3.2-3B-Instruct
library_name: peft
language:
- ko
- en
metrics:
- accuracy
pipeline_tag: text-classification
---

# Model Card for llama-3.2-3B-sentiment-kr-LoRA

- A Llama-3.2-3B model fine-tuned with LoRA while keeping the prompt fixed.
- It was trained to classify six emotions: happiness (기쁨), embarrassment (당황), anger (분노), anxiety (불안), hurt (상처), and sadness (슬픔).
- Training data comes from AIHUB's [Emotional Dialogue Corpus (감성 대화 말뭉치)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=86).
- The writer's age and gender were also used as inputs during training.

## Uses

```python
import re

import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

model = None
tokenizer = None
device = None

PROMPT = """<|prompt|>You are an AI assistant tasked with analyzing the emotional content of a diary entry. Your goal is to determine the most closely matching emotion from a predefined list.

Here is the diary entry you need to analyze:

age: {age} | gender: {gender} | diary: {sentence}

Please carefully read and analyze the content of this diary entry. Consider the overall tone, the events described, and the language used by the writer.

Based on your analysis, choose the emotion that best matches the overall sentiment of the diary entry from the following list:
['분노', '불안', '상처', '슬픔', '당황', '기쁨']

Translate these emotions to English for your understanding:
['분노(anger)', '불안(anxiety)', '상처(hurt)', '슬픔(sadness)', '당황(embarrassment)', '기쁨(happiness)']

After you've made your decision, respond with only the chosen emotion in Korean. Do not provide any explanation or additional text. Your response should be formatted as follows:
[chosen emotion in korean]

Once you've provided the emotion, end the conversation. Do not engage in any further dialogue or provide any additional information.
<|assistant|>"""


def load_model():
    global model, tokenizer, device
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    path = './llama-3.2-3B-sentiment-kr-LoRA'
    tokenizer = AutoTokenizer.from_pretrained(path)
    # Loads the base model and applies the LoRA adapter in one step.
    # flash_attention_2 requires the flash-attn package to be installed.
    model = AutoPeftModelForCausalLM.from_pretrained(
        path,
        attn_implementation="flash_attention_2",
        torch_dtype=torch.float16,
        device_map=device,
    )
    model.eval()


def generate(text, age, gender):
    global model, tokenizer, device
    text = PROMPT.format(age=age, gender=gender, sentence=text)
    inputs = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=11, pad_token_id=tokenizer.pad_token_id)
    decoded_output = tokenizer.decode(outputs[0])
    try:
        pred = decoded_output.split("<|assistant|>")[1]
        # The prompt asks for the answer as "[chosen emotion in korean]",
        # so extract the text between the square brackets.
        pred = re.search(r'\[(.*?)\]', pred).group(1)
    except (IndexError, AttributeError):
        pred = 'error'
    return pred


load_model()
print(generate("오늘 친구랑 싸웠어.", "", ""))  # "I fought with a friend today."
```

## Accuracy

A portion of the data was held out as a test set during training; the model achieved approximately 70% accuracy on it (a minimal reproduction sketch appears at the end of this card).

### Framework versions

- PEFT 0.13.0
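
For reference, here is a minimal sketch of how the accuracy figure above could be reproduced with the `load_model` and `generate` functions from the Uses section. The file name `test_split.csv` and the column names (`sentence`, `age`, `gender`, `emotion`) are hypothetical placeholders; you would need to prepare your own held-out split from the AIHUB corpus.

```python
# Minimal accuracy sketch. Assumes a held-out CSV prepared from the
# AIHUB corpus; 'test_split.csv' and its columns ('sentence', 'age',
# 'gender', 'emotion') are hypothetical placeholders.
import pandas as pd

load_model()
test_df = pd.read_csv("test_split.csv")

correct = 0
for row in test_df.itertuples():
    # generate() returns the predicted emotion label in Korean.
    correct += int(generate(row.sentence, row.age, row.gender) == row.emotion)

print(f"accuracy: {correct / len(test_df):.2%}")
```

Since the model is prompted to emit only a short bracketed label, exact string comparison against the gold label is sufficient here.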