winglian committed on
Commit c989146 · 1 Parent(s): cbd7499

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -31,7 +31,7 @@ Manticore 13B Chat is a Llama 13B model fine-tuned on the following datasets alo
 
  **Manticore 13B Chat was trained on 25% of the datasets below. The datasets were merged, shuffled, and then sharded into 4 parts.**
 
- - de-duped pygmalion dataset
+ - de-duped pygmalion dataset, filtered down to RP data
  - [riddle_sense](https://huggingface.co/datasets/riddle_sense) - instruct augmented
  - hellaswag, updated for detailed explanations w 30K+ rows
  - [gsm8k](https://huggingface.co/datasets/gsm8k) - instruct augmented
@@ -52,7 +52,9 @@ Manticore 13B
  Not added from Manticore 13B:
  - mmlu - mmlu datasets were not added to this model as the `test` split is used for benchmarks
 
-
+ # Shoutouts
+
+ Special thanks to Nanobit for helping with Axolotl, TheBloke for quantizing these models so they are more accessible to all, ehartford for cleaned datasets, and 0x000011b for the RP dataset.
  # Demo
 
  Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). Quantized GGML may have some minimal loss of model quality.
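
The README text in this diff says the datasets were merged, shuffled, and then sharded into 4 parts, with the model trained on 25% of the data. The following is a minimal sketch of that idea in plain Python, not the actual training pipeline. The function name `merge_shuffle_shard`, the fixed seed, the strided sharding, and the choice of which shard to train on are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence


def merge_shuffle_shard(
    datasets: Sequence[Sequence[Dict]],
    num_shards: int = 4,
    seed: int = 42,  # hypothetical seed; the README does not state one
) -> List[List[Dict]]:
    """Merge several datasets, shuffle the rows, and split them into shards."""
    # Merge: concatenate the rows of every dataset into one list.
    merged = [row for dataset in datasets for row in dataset]

    # Shuffle: use a local, seeded RNG so the result is reproducible.
    rng = random.Random(seed)
    rng.shuffle(merged)

    # Shard: strided slicing gives shards whose sizes differ by at most one row.
    return [merged[i::num_shards] for i in range(num_shards)]


if __name__ == "__main__":
    # Toy stand-ins for the real datasets (hypothetical rows).
    dataset_a = [{"id": i} for i in range(6)]
    dataset_b = [{"id": i} for i in range(6, 10)]

    shards = merge_shuffle_shard([dataset_a, dataset_b])
    training_shard = shards[0]  # assumption: any single shard is roughly 25% of the rows
    print(len(training_shard), "of", sum(len(s) for s in shards), "rows")
```

Under these assumptions, training on any one of the 4 shards uses roughly a quarter of the merged rows, which matches the "25%" figure in the README.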