|
--- |
|
license: other |
|
license_name: llama-3 |
|
license_link: https://llama.meta.com/llama3/license/ |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
datasets: |
|
- Replete-AI/code_bagel_hermes-2.5 |
|
- Replete-AI/code_bagel |
|
- Replete-AI/OpenHermes-2.5-Uncensored |
|
- teknium/OpenHermes-2.5 |
|
- layoric/tiny-codes-alpaca |
|
- glaiveai/glaive-code-assistant-v3 |
|
- ajibawa-2023/Code-290k-ShareGPT |
|
- TIGER-Lab/MathInstruct |
|
- chargoddard/commitpack-ft-instruct-rated |
|
- iamturun/code_instructions_120k_alpaca |
|
- ise-uiuc/Magicoder-Evol-Instruct-110K |
|
- cognitivecomputations/dolphin-coder |
|
- nickrosh/Evol-Instruct-Code-80k-v1 |
|
- coseal/CodeUltraFeedback_binarized |
|
- glaiveai/glaive-function-calling-v2 |
|
- CyberNative/Code_Vulnerability_Security_DPO |
|
- jondurbin/airoboros-2.2 |
|
- camel-ai |
|
- lmsys/lmsys-chat-1m |
|
- CollectiveCognition/chats-data-2023-09-22 |
|
- CoT-Alpaca-GPT4 |
|
- WizardLM/WizardLM_evol_instruct_70k |
|
- WizardLM/WizardLM_evol_instruct_V2_196k |
|
- teknium/GPT4-LLM-Cleaned |
|
- GPTeacher |
|
- OpenGPT |
|
- meta-math/MetaMathQA |
|
- Open-Orca/SlimOrca |
|
- garage-bAInd/Open-Platypus |
|
- anon8231489123/ShareGPT_Vicuna_unfiltered |
|
- Unnatural-Instructions-GPT4 |
|
model-index: |
|
- name: Replete-Coder-llama3-8b |
|
results: |
|
- task: |
|
name: HumanEval |
|
type: text-generation |
|
dataset: |
|
type: openai_humaneval |
|
name: HumanEval |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: .64683835842678326 |
|
verified: True |
|
- task: |
|
name: AI2 Reasoning Challenge |
|
type: text-generation |
|
dataset: |
|
name: AI2 Reasoning Challenge (25-Shot) |
|
type: ai2_arc |
|
config: ARC-Challenge |
|
split: test |
|
args: |
|
num_few_shot: 25 |
|
metrics: |
|
- type: accuracy |
|
value: |
|
name: normalized accuracy |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
- task: |
|
name: Text Generation |
|
type: text-generation |
|
dataset: |
|
name: HellaSwag (10-Shot) |
|
type: hellaswag |
|
split: validation |
|
args: |
|
num_few_shot: 10 |
|
metrics: |
|
- type: accuracy |
|
value: |
|
name: normalized accuracy |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
- task: |
|
name: Text Generation |
|
type: text-generation |
|
dataset: |
|
name: MMLU (5-Shot) |
|
type: cais/mmlu |
|
config: all |
|
split: test |
|
args: |
|
num_few_shot: 5 |
|
metrics: |
|
- type: accuracy |
|
value: |
|
name: accuracy |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
- task: |
|
name: Text Generation |
|
type: text-generation |
|
dataset: |
|
name: TruthfulQA (0-shot) |
|
type: truthful_qa |
|
config: multiple_choice |
|
split: validation |
|
args: |
|
num_few_shot: 0 |
|
metrics: |
|
- type: multiple_choice_accuracy |
|
value: |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
- task: |
|
name: Text Generation |
|
type: text-generation |
|
dataset: |
|
name: Winogrande (5-shot) |
|
type: winogrande |
|
config: winogrande_xl |
|
split: validation |
|
args: |
|
num_few_shot: 5 |
|
metrics: |
|
- type: accuracy |
|
value: |
|
name: accuracy |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
- task: |
|
name: Text Generation |
|
type: text-generation |
|
dataset: |
|
name: GSM8k (5-shot) |
|
type: gsm8k |
|
config: main |
|
split: test |
|
args: |
|
num_few_shot: 5 |
|
metrics: |
|
- type: accuracy |
|
value: |
|
name: accuracy |
|
source: |
|
url: https://www.placeholderurl.com |
|
name: Open LLM Leaderboard |
|
--- |
|
# Replete-Coder-llama3-8b |
|
Finetuned by: Rombodawg |
|
### More than just a coding model! |
|
Although Replete-Coder has amazing coding capabilities, it's trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding, use it for all your needs! We are truly trying to make the GPT killer!
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/-0dERC793D9XeFsJ9uHbx.png)
|
|
|
Thank you to TensorDock for sponsoring Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b. You can check out their website for cloud compute rental below.
|
- https://tensordock.com |
|
__________________________________________________________________________________________________ |
|
Replete-Coder-llama3-8b is a general-purpose model that is specially trained for coding in over 100 coding languages. The training data contains 25% non-code instruction data and 75% coding instruction data, totaling 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The data was 100% uncensored and fully deduplicated before training.
|
|
|
The Replete-Coder models (including Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b) feature the following: |
|
|
|
- Advanced coding capabilities in over 100 coding languages |
|
- Advanced code translation (between languages) |
|
- Security and vulnerability prevention related coding capabilities |
|
- General purpose use |
|
- Uncensored use |
|
- Function calling |
|
- Advanced math use |
|
- Use on low-end (8b) and mobile (1.5b) platforms
|
|
|
Notice: The Replete-Coder series of models is fine-tuned on a context window of 8192 tokens. Performance past this context window is not guaranteed.
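As a rough illustration of the notice above, the sketch below checks whether a prompt plus the requested generation length stays inside the 8192-token window. The helper names (`max_new_tokens_allowed`, `fits_in_context`) are hypothetical and not part of any library; the token counts are assumed to come from whichever tokenizer you load with the model.

```python
CONTEXT_WINDOW = 8192  # fine-tuning context length stated in the notice


def max_new_tokens_allowed(num_prompt_tokens: int,
                           context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens can still be generated inside the window."""
    return max(0, context_window - num_prompt_tokens)


def fits_in_context(num_prompt_tokens: int, max_new_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the requested generation stays within the window."""
    return num_prompt_tokens + max_new_tokens <= context_window
```

For example, a prompt that tokenizes to 8000 tokens leaves room for at most 192 generated tokens before leaving the range the model was fine-tuned on.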
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/C-zxpY5n8KuzQeocmhk0g.png)
|
__________________________________________________________________________________________________ |
|
You can find the 25% non-coding instruction data below:
|
|
|
- https://huggingface.co/datasets/Replete-AI/OpenHermes-2.5-Uncensored |
|
|
|
And the 75% coding-specific instruction data below:
|
|
|
- https://huggingface.co/datasets/Replete-AI/code_bagel |
|
|
|
These two datasets were combined to create the final dataset for training, which is linked below: |
|
|
|
- https://huggingface.co/datasets/Replete-AI/code_bagel_hermes-2.5 |
|
__________________________________________________________________________________________________ |
|
## Prompt Template: Custom Alpaca |
|
```
### System:
{}

### Instruction:
{}

### Response:
{}
```

Note: The system prompt varies in training data, but the most commonly used one is:

```
Below is an instruction that describes a task, Write a response that appropriately completes the request.
```

End token:

```
<|endoftext|>
```
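A minimal sketch of how the template, the common system prompt, and the end token above might be assembled in Python. The function names are hypothetical, and the exact whitespace (one newline after each header, one blank line between sections) is an assumption read off the template as displayed, not something confirmed against the training code.

```python
# System prompt and end token copied verbatim from the card.
DEFAULT_SYSTEM = (
    "Below is an instruction that describes a task, "
    "Write a response that appropriately completes the request."
)
END_TOKEN = "<|endoftext|>"

# Assumed whitespace layout of the Custom Alpaca template.
PROMPT_TEMPLATE = "### System:\n{}\n\n### Instruction:\n{}\n\n### Response:\n{}"


def build_prompt(instruction: str, system: str = DEFAULT_SYSTEM,
                 response: str = "") -> str:
    """Fill the template; leave `response` empty when prompting for inference."""
    return PROMPT_TEMPLATE.format(system, instruction, response)


def build_training_example(instruction: str, response: str,
                           system: str = DEFAULT_SYSTEM) -> str:
    """Fill all three slots and append the end token, as for a training record."""
    return build_prompt(instruction, system, response) + END_TOKEN


def extract_response(generated: str) -> str:
    """Return the text after the last response header, cut at the end token."""
    body = generated.rsplit("### Response:", 1)[-1]
    return body.split(END_TOKEN, 1)[0].strip()
```

At inference time, `build_prompt("Write a binary search in Rust.")` yields a string that ends with the `### Response:` header, so the model continues from there; `extract_response` can then trim the completion at the end token.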
|
__________________________________________________________________________________________________ |
|
Thank you to the community for your contributions to the Replete-AI/code_bagel_hermes-2.5 dataset. Without the participation of so many members making their datasets free and open source for anyone to use, this amazing AI model wouldn't be possible.
|
|
|
Extra special thanks to Teknium for the Open-Hermes-2.5 dataset, and to jondurbin for the bagel dataset and the naming idea for the code_bagel series of datasets. You can find both of their Hugging Face accounts linked below:
|
|
|
- https://huggingface.co/teknium |
|
- https://huggingface.co/jondurbin |
|
|
|
Another special thanks to unsloth for being the main method of training for Replete-Coder. Below you can find their GitHub, as well as the special Replete-Ai secret sauce (Unsloth + QLoRA + GaLore) Colab notebook that was used to train this model.
|
|
|
- https://github.com/unslothai/unsloth |
|
- https://colab.research.google.com/drive/1VAaxMQJN9-78WLsPU0GWg5tEkasXoTP9?usp=sharing |
|
__________________________________________________________________________________________________ |
|
## Join the Replete-Ai Discord! We are a great and loving community!
|
|
|
- https://discord.gg/ZZbnsmVnjD |