---
library_name: transformers
tags:
- code
- gemma-2b
- finetune
- qlora
license: apache-2.0
datasets:
- SaikatM/Code-Platypus
language:
- en
---
# Model Card for Code-Gemma-v1
This model is a fine-tuned version of google/gemma-2b on the SaikatM/Code-Platypus dataset.
## Model Details
### Model Description
- **Finetuned from model:** google/gemma-2b
### Model Sources
Training code can be found here: https://github.com/Saikat-M/LLM-Finetuning
### Direct Use
* Code generation tasks
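A minimal sketch of using the model for code generation with `transformers`. The repository id `SaikatM/Code-Gemma-v1` is an assumption inferred from this card's name and author, and the sketch assumes the repository can be loaded directly with `AutoModelForCausalLM`; if it only contains a LoRA adapter, `peft` may also need to be installed. The helper name `generate_code` and the `max_new_tokens` default are illustrative.

```python
def generate_code(prompt, model_id="SaikatM/Code-Gemma-v1", max_new_tokens=128):
    """Generate a completion for `prompt` (model_id is an assumed repo id)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt and let the model continue it.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_code("Write a function to sort a list in python")` downloads the weights on first use and returns the prompt followed by the model's continuation.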
### Training Data
* Dataset: https://huggingface.co/datasets/SaikatM/Code-Platypus
* Source Dataset: https://huggingface.co/datasets/garage-bAInd/Open-Platypus
### Training Procedure
Fine-tuned with QLoRA using PEFT and TRL's `SFTTrainer`.
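QLoRA loads the base model in 4-bit precision before attaching LoRA adapters. This card does not list the quantization settings, so the values below are commonly used QLoRA settings and are assumptions, not necessarily what was used here. The `float16` compute dtype is chosen because the training arguments set `bf16=False`; the helper names are hypothetical.

```python
def qlora_quantization_kwargs():
    """Assumed 4-bit settings, meant to be passed as BitsAndBytesConfig(**kwargs)."""
    return {
        "load_in_4bit": True,                 # store base weights in 4-bit
        "bnb_4bit_quant_type": "nf4",         # NormalFloat4 quantization
        "bnb_4bit_use_double_quant": True,    # also quantize the quantization constants
        "bnb_4bit_compute_dtype": "float16",  # assumption: fp16, since bf16=False
    }


def load_quantized_base(model_id="google/gemma-2b"):
    """Load the base model in 4-bit (requires transformers, accelerate, bitsandbytes)."""
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**qlora_quantization_kwargs())
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
```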
#### Preprocessing
The Code-Platypus dataset was built by filtering the Open-Platypus dataset for rows whose `data_source` column contains `leetcode_ne`.
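The filtering step can be sketched as a simple predicate, assuming the intent is to keep only the rows whose `data_source` mentions `leetcode_ne`. The sample rows and the name `is_leetcode_row` are hypothetical stand-ins, not taken from the training code.

```python
def is_leetcode_row(row):
    """Return True when the row's data_source mentions leetcode_ne."""
    return "leetcode_ne" in row["data_source"]


# Hypothetical rows standing in for Open-Platypus records.
sample_rows = [
    {"instruction": "Reverse a linked list.", "data_source": "leetcode_ne"},
    {"instruction": "Solve for x: 2x + 1 = 5.", "data_source": "MATH/PRM-800K"},
]

code_rows = [row for row in sample_rows if is_leetcode_row(row)]
print(len(code_rows))  # 1

# With the `datasets` library, the same predicate can be passed to Dataset.filter:
#   from datasets import load_dataset
#   ds = load_dataset("garage-bAInd/Open-Platypus", split="train")
#   code_ds = ds.filter(is_leetcode_row)
```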
#### Training Hyperparameters
```python
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# `modules`, `model`, `tokenizer`, `train_data`, and `test_data` are defined
# elsewhere in the training code.
lora_config = LoraConfig(
    r=4,
    lora_alpha=2,
    target_modules=modules,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

training_arguments = TrainingArguments(
    output_dir="gemma-2b-code-platypus",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    optim="paged_adamw_8bit",
    logging_steps=1,
    save_strategy="epoch",
    bf16=False,
    tf32=False,
    learning_rate=2e-4,
    max_steps=100,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    push_to_hub=False,
    report_to="tensorboard",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_data,
    eval_dataset=test_data,
    dataset_text_field="text",
    peft_config=lora_config,
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_arguments,
)
```
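A few quantities follow directly from these settings and can be worked out as below. The sketch assumes a single GPU (`num_gpus = 1`, consistent with a Colab T4) and uses the standard LoRA scaling factor `lora_alpha / r`. Because `max_steps` is positive, it takes precedence over `num_train_epochs` in `TrainingArguments`, so training stops after 100 optimizer steps.

```python
# Values copied from the configuration above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
max_steps = 100
max_seq_length = 512
lora_r = 4
lora_alpha = 2

num_gpus = 1  # assumption: a single Colab T4

# Sequences contributing to each optimizer step.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# Sequences processed over the whole run, and an upper bound on tokens
# (each sequence is truncated to max_seq_length).
sequences_seen = effective_batch_size * max_steps
max_tokens_seen = sequences_seen * max_seq_length

# Factor applied to the LoRA update before it is added to the frozen weights.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size)  # 16
print(sequences_seen)        # 1600
print(max_tokens_seen)       # 819200
print(lora_scaling)          # 0.5
```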
#### Speeds, Sizes, Times
Training took around 1 hour.
### Results
* Test Result 1:
```
Write a fucntion to sort a list in python
Answer:
def sort_list(list):
    return sorted(list)<eos>
Response: None
```
* Test Result 2:
```
Write a function to count Consonants in a Given Word in Python
Response: None
```
* Test Result 3:
```
Write a function to count the number of vowels in a given string in Python.
Example 1:
Input: s = "leetcodeisgreat"
Output: 5
Explanation: The vowels are 'e', 'i', 'a', 'o', and 'u'.
Example 2:
Input: s = "leetcodeisgreat"
Output: 0
Explanation: The vowels are 'e', 'i', 'a', 'o', and 'u'.
Constraints:
* 1 <= s.length <= 100
* s consists of lowercase English letters.
def countVowels(s):
    count = 0
    for c in s:
        if c in 'aeiou':
            count += 1
    return count
<eos>
Response: None
```
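The generated functions from Test Results 1 and 3 can be sanity-checked by running them. The sketch below copies the two functions as generated, with standard Python indentation, and uses inputs chosen here for illustration. It suggests the function bodies are usable, while the "Output: 5" and "Output: 0" examples the model wrote around `countVowels` do not match what the function returns for that input.

```python
# Functions copied from the generated outputs above.
def sort_list(list):
    return sorted(list)


def countVowels(s):
    count = 0
    for c in s:
        if c in 'aeiou':
            count += 1
    return count


print(sort_list([3, 1, 2]))            # [1, 2, 3]
print(countVowels("leetcodeisgreat"))  # 7
```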
### Compute Infrastructure
Trained in Google Colab.
#### Hardware
T4 GPU hardware accelerator.