nvidia/Hymba-1.5B-Base

Text Generation

Xin Dong committed on Dec 3, 2024

Commit 5c73b29 · 1 Parent(s): 6abbf5e

add eval

Files changed (1)

README.md +32 -0

@@ -106,6 +106,38 @@ print(f"Model response: {response}")
 ```
 ## Limitations

 ```
+## Evaluation
+We use [`LM Evaluation Harness`](https://github.com/EleutherAI/lm-evaluation-harness) to evaluate the model. The evaluation commands are as follows:
+```bash
+git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
+cd lm-evaluation-harness  # enter the repository before running further git commands
+git fetch --all --tags
+git checkout tags/v0.4.4  # the squad_completion task is not compatible with the latest version
+pip install -e .
+lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks mmlu \
+     --num_fewshot 5 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks arc_easy,arc_challenge,piqa,winogrande,hellaswag \
+     --num_fewshot 0 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks squad_completion \
+     --num_fewshot 1 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+```
 ## Limitations