Xin Dong committed on
Commit
5c73b29
·
1 Parent(s): 6abbf5e
Files changed (1)
  1. README.md +32 -0
README.md CHANGED
@@ -106,6 +106,38 @@ print(f"Model response: {response}")
 
  ```
 
+ ## Evaluation
+ We use [`LM Evaluation Harness`](https://github.com/EleutherAI/lm-evaluation-harness) to evaluate the model. The evaluation commands are as follows:
+
+ ```bash
+ git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
+ cd lm-evaluation-harness
+ git fetch --all --tags
+ git checkout tags/v0.4.4 # squad completion task is not compatible with the latest version
+ pip install -e .
+
+ lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks mmlu \
+     --num_fewshot 5 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+
+ lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks arc_easy,arc_challenge,piqa,winogrande,hellaswag \
+     --num_fewshot 0 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+
+ lm_eval --model hf --model_args pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True \
+     --tasks squad_completion \
+     --num_fewshot 1 \
+     --batch_size 1 \
+     --output_path ./hymba_HF_base_lm-results \
+     --log_samples
+ ```
+
 
  ## Limitations
 
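The three `lm_eval` invocations added in this commit differ only in `--tasks` and `--num_fewshot`. As a rough sketch (not part of the commit), the same runs could be driven from one small Python script. The helper names `build_command`, `build_all_commands`, and `run_all` are hypothetical. The flags and values are copied from the commands in the diff, and the script assumes `lm_eval` v0.4.4 is already installed and on `PATH`.

```python
import shlex
import subprocess
from typing import Dict, List

# Values copied from the commands in the diff above.
MODEL_ARGS = "pretrained=nvidia/Hymba-1.5B-Base,dtype=bfloat16,trust_remote_code=True"
OUTPUT_PATH = "./hymba_HF_base_lm-results"

# Task group -> few-shot count, one entry per lm_eval invocation in the diff.
FEWSHOT_BY_TASKS: Dict[str, int] = {
    "mmlu": 5,
    "arc_easy,arc_challenge,piqa,winogrande,hellaswag": 0,
    "squad_completion": 1,
}


def build_command(tasks: str, num_fewshot: int) -> List[str]:
    """Build the argv list for one lm_eval run (hypothetical helper)."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", MODEL_ARGS,
        "--tasks", tasks,
        "--num_fewshot", str(num_fewshot),
        "--batch_size", "1",
        "--output_path", OUTPUT_PATH,
        "--log_samples",
    ]


def build_all_commands() -> List[List[str]]:
    """Return the three commands in the same order as the README."""
    return [build_command(tasks, n) for tasks, n in FEWSHOT_BY_TASKS.items()]


def run_all(dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_all_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing evaluation run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

By default the script only prints the commands so they can be compared against the README. Calling `run_all(dry_run=False)` would actually launch the evaluations.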