TinyLlama-1.1B: my personal test update, version 2

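| Tasks         | Version | Filter | n-shot | Metric   | Value  | Stderr   |
|---------------|---------|--------|--------|----------|--------|----------|
| arc_challenge | Yaml    | none   | 0      | acc      | 0.2790 | ± 0.0131 |
|               |         | none   | 0      | acc_norm | 0.3234 | ± 0.0137 |
| arc_easy      | Yaml    | none   | 0      | acc      | 0.6006 | ± 0.0101 |
|               |         | none   | 0      | acc_norm | 0.5770 | ± 0.0101 |
| boolq         | Yaml    | none   | 0      | acc      | 0.6373 | ± 0.0084 |
| hellaswag     | Yaml    | none   | 0      | acc      | 0.4521 | ± 0.0050 |
|               |         | none   | 0      | acc_norm | 0.5822 | ± 0.0049 |
| openbookqa    | Yaml    | none   | 0      | acc      | 0.2220 | ± 0.0186 |
|               |         | none   | 0      | acc_norm | 0.3740 | ± 0.0217 |
| piqa          | Yaml    | none   | 0      | acc      | 0.7269 | ± 0.0104 |
|               |         | none   | 0      | acc_norm | 0.7296 | ± 0.0104 |
| winogrande    | Yaml    | none   | 0      | acc      | 0.5754 | ± 0.0139 |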
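
To make the standard errors easier to read, the following sketch turns each zero-shot `acc` value into an approximate 95% interval and computes an unweighted mean across tasks. The numbers are copied from the results above; the 1.96 multiplier (a normal approximation) and the choice to average tasks with equal weight are assumptions made for illustration, not part of the reported evaluation.

```python
# Zero-shot `acc` values and standard errors copied from the results above.
results = {
    "arc_challenge": (0.2790, 0.0131),
    "arc_easy": (0.6006, 0.0101),
    "boolq": (0.6373, 0.0084),
    "hellaswag": (0.4521, 0.0050),
    "openbookqa": (0.2220, 0.0186),
    "piqa": (0.7269, 0.0104),
    "winogrande": (0.5754, 0.0139),
}

Z_95 = 1.96  # assumption: normal-approximation multiplier for a 95% interval


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) bounds of value +/- z * stderr."""
    return value - z * stderr, value + z * stderr


for task, (acc, stderr) in results.items():
    low, high = confidence_interval(acc, stderr)
    print(f"{task:<14} acc={acc:.4f}  95% CI=[{low:.4f}, {high:.4f}]")

# Assumption: every task counts equally in the average.
mean_acc = sum(acc for acc, _ in results.values()) / len(results)
print(f"unweighted mean acc: {mean_acc:.4f}")
```

An interval that contains 0.5 on a two-choice task such as winogrande would mean the score could not be told apart from chance; here the lower bound stays above it.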

https://github.com/jzhang38/TinyLlama

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With proper optimization, we can achieve this in "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.
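
As a quick sanity check on those figures, the arithmetic below works out the throughput the plan implies. It is a back-of-the-envelope sketch that assumes the GPUs run without interruption for the full 90 days; real training would need some headroom for restarts and evaluation.

```python
# Figures taken from the paragraph above.
TOTAL_TOKENS = 3e12      # 3 trillion tokens
TRAINING_DAYS = 90
NUM_GPUS = 16
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: 100% uptime over the whole training window.
total_seconds = TRAINING_DAYS * SECONDS_PER_DAY
cluster_tokens_per_second = TOTAL_TOKENS / total_seconds
per_gpu_tokens_per_second = cluster_tokens_per_second / NUM_GPUS

print(f"cluster: {cluster_tokens_per_second:,.0f} tokens/s")
print(f"per GPU: {per_gpu_tokens_per_second:,.0f} tokens/s")
```

Under these assumptions each GPU has to sustain roughly 24 thousand tokens per second.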

We adopted exactly the same architecture and tokenizer as Llama 2, so TinyLlama can be dropped into many open-source projects built on Llama. In addition, TinyLlama is compact, with only 1.1B parameters, which suits applications with a restricted computation and memory footprint.
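
To give a rough sense of that memory footprint, the sketch below estimates the size of the weights alone at a few numeric precisions. It assumes a nominal 1.1e9 parameters and ignores activations, the KV cache, and framework overhead; the `int8` row is a hypothetical quantized setting included only for comparison.

```python
NUM_PARAMS = 1.1e9  # nominal parameter count; the exact figure may differ slightly

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4,
    "bfloat16": 2,
    "int8": 1,  # hypothetical quantized setting, for comparison only
}


def weight_memory_gib(num_params, bytes_per_param):
    """Memory for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:<9} ~{weight_memory_gib(NUM_PARAMS, nbytes):.2f} GiB")
```

In bfloat16 the weights come to about 2 GiB under these assumptions.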

This Model

This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T. We follow the training recipe of Hugging Face's Zephyr. The model was "initially fine-tuned on a variant of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT. We then further aligned the model with 🤗 TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contain 64k prompts and model completions that are ranked by GPT-4."
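
For readers unfamiliar with the alignment step, here is a minimal sketch of the per-example DPO objective for one ranked pair of completions. It is written in plain Python for illustration and is not TRL's implementation; the function name, the `beta=0.1` default, and the example log-probabilities are assumptions, not values taken from this model's training run.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of summed log-probabilities.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    `beta` is an assumed value, not one reported for this model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))

# Policy prefers the chosen completion more than the reference does: lower loss.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss falls as the policy raises the likelihood of the preferred completion relative to the frozen reference model, and rises when it favors the rejected one.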

How to use

You will need transformers>=4.34. Check the TinyLlama GitHub page for more information.

# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
import torch
from transformers import pipeline
pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# ...
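
To show what the chat template does to the message list, the sketch below rebuilds the prompt string by hand. It only approximates the format visible in the commented output above; the function name and the exact whitespace are assumptions, and the tokenizer's own `apply_chat_template` remains the authoritative way to build prompts.

```python
EOS_TOKEN = "</s>"  # end-of-sequence marker as it appears in the output above


def render_zephyr_style_prompt(messages, add_generation_prompt=True):
    """Approximate the Zephyr-style layout: a role tag, the content, then EOS."""
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}{EOS_TOKEN}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(render_zephyr_style_prompt(messages))
```

Setting `add_generation_prompt=False` drops the trailing assistant tag, which is the shape you would want when formatting complete conversations rather than prompting for a reply.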