---
library_name: transformers
license: apache-2.0
base_model: abacaj/llama-161M-100B
pipeline_tag: text-generation
---

# QuantFactory/llama-161M-100B-GGUF

This is a quantized version of [abacaj/llama-161M-100B](https://huggingface.co/abacaj/llama-161M-100B) created using llama.cpp.

# Model Description

Trained on 100B tokens.

- 1e-3 learning rate
- 0.1 weight decay
- WSD scheduler with 10% decay (a schedule sketch follows the details table)
- 80% code, 10% natural language, 10% instruction data
- Dataset decontaminated against popular benchmarks following [bigcode](https://github.com/bigcode-project/bigcode-dataset/tree/main/decontamination)
- 8x 3090s, ~110 hours

This is a *base* pretrained model and requires further fine-tuning to be useful.

## Model Details

| [openai/openai_humaneval](https://huggingface.co/datasets/openai/openai_humaneval) (greedy) | [mbpp](https://huggingface.co/datasets/google-research-datasets/mbpp) (greedy) |
| :------------------ | :------------- |
| 9.2% | 9.8% |
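As referenced above, a minimal sketch of a warmup-stable-decay (WSD) learning-rate schedule using the stated 1e-3 peak LR and 10% decay fraction. The warmup length and the linear decay shape are assumptions for illustration; the card only states that a WSD scheduler with 10% decay was used.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float = 1e-3,
           warmup_steps: int = 1000, decay_frac: float = 0.1) -> float:
    """Warmup-stable-decay: linear warmup, constant plateau, then decay
    over the final `decay_frac` of training. Warmup length and linear
    decay shape are assumptions, not confirmed details of the run."""
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # linear warmup
    if step < decay_start:
        return peak_lr  # stable plateau at the peak LR
    # decay linearly to zero over the final decay fraction of steps
    return peak_lr * (total_steps - step) / (total_steps - decay_start)
```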
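## Usage

A minimal sketch of running the quantized model with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). The GGUF filename and quantization level below are assumptions; match them to the file you download from this repository. Since this is a base model, prompt it with a prefix to complete rather than a chat turn.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="llama-161M-100B.Q4_K_M.gguf",  # assumed filename/quant level
    n_ctx=2048,
)

# Base model: give it a code prefix to complete, not an instruction.
output = llm(
    "def fibonacci(n):\n",
    max_tokens=128,
    temperature=0.0,  # greedy decoding, matching the benchmark setup
)
print(output["choices"][0]["text"])
```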