rishiraj committed
Commit f2c58e5 · 1 Parent(s): 65d316e

Update README.md

Files changed (1)
  1. README.md +54 -20
README.md CHANGED
@@ -24,30 +24,57 @@ model-index:
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # 😼 CatPPT

- Introducing "CatPPT" - the purrfect alternative to that other big cat in town, known for keeping all the secrets to itself! Our feline friend here is a Large Language Model like no other, created through the magical process of merging openchat and neuralchat models using the enchanting Gradient SLERP method.

- This whiskered wonder boasts being the top-performing 7B model on the block, free from any whiff of evaluation data contamination. So go ahead, let your curiosity run wild, and engage with this independent, open-source kitty who's ready to pounce on all your language processing needs. Just remember, there's no need to feel left out in the cold when you have CatPPT warming up your cozy corner of the internet!

- This model is a fine-tuned version of [rishiraj/CatPPT-base](https://huggingface.co/rishiraj/CatPPT-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 2.0093

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure
@@ -78,10 +105,17 @@ The following hyperparameters were used during training:
  - Pytorch 2.1.2+cu121
  - Datasets 2.14.6
  - Tokenizers 0.15.0
- ## Training procedure
-
-
- ### Framework versions
-
-
  - PEFT 0.6.1

  results: []
  ---

  # 😼 CatPPT

+ Introducing "CatPPT" - the purrfect alternative to that other big cat in town, known for keeping all the secrets to itself! Our feline friend here was created by merging the openchat and neuralchat models with the Gradient SLERP method (resulting in [rishiraj/CatPPT-base](https://huggingface.co/rishiraj/CatPPT-base)) and then fine-tuning it on the no_robots dataset for chat.
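+
+ For intuition, here is a minimal sketch of what SLERP does to a single pair of weight tensors. It is only an illustration of the interpolation idea - not the actual merge configuration or toolkit behind CatPPT-base, and the layer name in the usage comment is hypothetical.
+
+ ```
+ import torch
+
+ def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float = 0.5, eps: float = 1e-8) -> torch.Tensor:
+     """Spherically interpolate between two weight tensors with mixing factor t."""
+     a, b = w_a.flatten().float(), w_b.flatten().float()
+     a_unit, b_unit = a / (a.norm() + eps), b / (b.norm() + eps)
+     omega = torch.arccos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
+     sin_omega = torch.sin(omega)
+     if sin_omega.abs() < eps:
+         # Nearly parallel directions: fall back to plain linear interpolation.
+         merged = (1 - t) * a + t * b
+     else:
+         merged = (torch.sin((1 - t) * omega) / sin_omega) * a + (torch.sin(t * omega) / sin_omega) * b
+     return merged.reshape(w_a.shape).to(w_a.dtype)
+
+ # Example (hypothetical layer name): blend one layer half-way between the two parents.
+ # merged = slerp(openchat_state["model.layers.0.mlp.up_proj.weight"],
+ #                neuralchat_state["model.layers.0.mlp.up_proj.weight"], t=0.5)
+ ```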
+
+ This is the top-performing 7B model on the leaderboard that's free from any whiff of evaluation data contamination.
+
+ ## Model date
+
+ rishiraj/CatPPT was trained between 15th and 17th December 2023.
+
+ ## Evaluation
+
+ CatPPT achieves the following results on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard). At the time of release, it is the highest-ranked 7B chat model on the leaderboard that's free from evaluation data contamination.
+
+ |Model                               |Average  |ARC      |HellaSwag|MMLU     |TruthfulQA|Winogrande|GSM8K    |
+ |------------------------------------|---------|---------|---------|---------|----------|----------|---------|
+ |**rishiraj/CatPPT**                 |**72.32**|**68.09**|**86.69**|**65.16**|**61.55** |**81.61** |**70.81**|
+ |Intel/neural-chat-7b-v3-3           |69.83    |66.89    |85.26    |63.07    |63.01     |79.64     |61.11    |
+ |openchat/openchat-3.5-1210          |68.89    |64.93    |84.92    |64.62    |52.15     |80.74     |65.96    |
+ |meta-math/MetaMath-Mistral-7B       |65.78    |60.67    |82.58    |61.95    |44.89     |75.77     |68.84    |
+ |Deci/DeciLM-7B-instruct             |63.19    |61.01    |82.37    |60.24    |49.75     |79.72     |46.02    |
+ |mistralai/Mistral-7B-Instruct-v0.2  |65.71    |63.14    |84.88    |60.78    |68.26     |77.19     |40.03    |
+ |mistralai/Mixtral-8x7B-Instruct-v0.1|72.62    |70.22    |87.63    |71.16    |64.58     |81.37     |60.73    |
+ |meta-llama/Llama-2-70b-hf           |67.87    |67.32    |87.33    |69.83    |44.92     |83.74     |54.06    |
+ |tiiuae/falcon-180B                  |67.85    |69.45    |88.86    |70.5     |45.47     |86.9      |45.94    |
+
+ ## Inference procedure
+
+ Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:
+
+ ```
+ import torch
+ from transformers import pipeline
+
+ pipe = pipeline("text-generation", model="rishiraj/CatPPT", torch_dtype=torch.bfloat16, device_map="auto")
+
+ # We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
+ messages = [
+     {
+         "role": "system",
+         "content": "You are a friendly chatbot who always responds in the style of a pirate"
+     },
+     {
+         "role": "user",
+         "content": "How many helicopters can a human eat in one sitting?"
+     }
+ ]
+ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```
+
  ## Training procedure

  - Pytorch 2.1.2+cu121
  - Datasets 2.14.6
  - Tokenizers 0.15.0
  - PEFT 0.6.1
+
+ ## Citation Information
+
+ ```
+ @misc{rishiraj2023catppt,
+   author = {Rishiraj Acharya},
+   title = {CatPPT},
+   year = {2023},
+   publisher = {Hugging Face},
+   journal = {Hugging Face repository},
+   howpublished = {\url{https://huggingface.co/rishiraj/CatPPT}}
+ }
+ ```