Create README.md
README.md
ADDED
@@ -0,0 +1,99 @@
---
license: apache-2.0
language:
- en
tags:
- InstructGPT
- hf
---

# InstructPalmyra-20b

<style>
img {
 display: inline;
}
</style>

## Model Description

InstructPalmyra-20b is a 20B-parameter instruction-following language model. It is derived from [Palmyra-20b](https://huggingface.co/Writer/palmyra-large) and tuned to understand and carry out natural-language instructions.

The model is trained on approximately 70,000 instruction-response records. These records are generated by our Writer Linguist team, who have considerable expertise in language modeling and fine-tuning techniques.

InstructPalmyra-20b is designed to process complex instructions and generate accurate, contextually appropriate responses. This makes it suitable for a wide range of applications, including virtual assistants, customer support, and content generation. Its training is also intended to help it adapt to varying conditions and contexts, which broadens its potential use cases.

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Writer/InstructPalmyra-20b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)

instruction = "Describe a futuristic device that revolutionizes space travel."
# Optional extra context for the instruction; leave empty to use the no-input prompt.
# (The original snippet tested the Python builtin `input`, which is always truthy.)
input_text = ""

PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}

# Pick the template based on whether extra context was supplied.
text = (
    PROMPT_DICT["prompt_no_input"].format(instruction=instruction)
    if not input_text
    else PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input_text)
)

# Move the inputs to the device that holds the model's first parameters.
model_inputs = tokenizer(text, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **model_inputs,
    max_length=256,  # total length, prompt tokens included
)
output_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
# Keep only the text generated after the response marker.
clean_output = output_text.split("### Response:")[1].strip()

print(clean_output)
```
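
The prompt construction and response extraction in the usage example can be factored into two small functions, which makes them easy to check without loading the model. The following is a minimal sketch, not part of the model repository: the names `build_prompt`, `extract_response`, `PROMPT_WITH_INPUT`, `PROMPT_NO_INPUT`, and `RESPONSE_MARKER` are hypothetical, and the template strings are copied from the usage example as written.

```python
# Hypothetical helpers (not shipped with the model) that wrap the prompt
# template and the response extraction shown in the usage example.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction, context=None):
    """Format the prompt; a non-empty `context` fills the optional Input section."""
    if context:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    return PROMPT_NO_INPUT.format(instruction=instruction)


def extract_response(generated_text):
    """Return everything after the first response marker.

    If the marker is missing, the whole text is returned (stripped) instead
    of raising an IndexError.
    """
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if marker else generated_text.strip()
```

With these helpers, `build_prompt(instruction)` would produce the string passed to the tokenizer, and `extract_response` would be applied to the decoded output in place of the `split` call.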

## Limitations and Biases

InstructPalmyra's core functionality is to take a string of text and predict the next token. While language models are widely used for other tasks, there are many unknowns in this work. When prompting InstructPalmyra, keep in mind that the next statistically likely token is not always the token that produces the most "accurate" text. Never rely on InstructPalmyra to produce factually correct results.

InstructPalmyra was trained on Writer’s custom data. As with all language models, it is difficult to predict how InstructPalmyra will respond to specific prompts, and offensive content may appear unexpectedly. We recommend that the outputs be curated or filtered by humans before they are released, both to censor undesirable content and to improve the quality of the results.
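
One way to support the human curation recommended here is to pre-sort generated outputs so that reviewers see the most suspicious ones first. The sketch below is an illustration under stated assumptions, not a Writer tool: `BLOCKED_TERMS`, `needs_review`, and `triage` are hypothetical names, the term list is a placeholder, and a simple word match like this supplements human review rather than replacing it.

```python
import re

# Placeholder terms for illustration only; a real deployment would maintain its own list.
BLOCKED_TERMS = ("blockedterm1", "blockedterm2")


def needs_review(text, blocked_terms=BLOCKED_TERMS):
    """Return True if any blocked term appears as a whole word (case-insensitive)."""
    return any(
        re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
        for term in blocked_terms
    )


def triage(outputs, blocked_terms=BLOCKED_TERMS):
    """Split outputs into (flagged, unflagged) lists, preserving input order."""
    flagged, unflagged = [], []
    for text in outputs:
        (flagged if needs_review(text, blocked_terms) else unflagged).append(text)
    return flagged, unflagged
```

Both lists would still go to a human reviewer; the split only determines which outputs are looked at first.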

## Citation and Related Information

To cite this model:
```
@misc{InstructPalmyra,
  author = {Writer Engineering team},
  title = {{InstructPalmyra-20b : Instruct tuned Palmyra-Large model}},
  howpublished = {\url{https://dev.writer.com}},
  year = 2023,
  month = July
}
```

[![Model architecture](https://img.shields.io/badge/Model%20Arch-Transformer%20Decoder-green)](#model-architecture)|[![Model size](https://img.shields.io/badge/Params-20B-green)](#model-architecture)|[![Language](https://img.shields.io/badge/Language-en--US-lightgrey#model-badge)](#datasets)|![AUR license](https://img.shields.io/badge/license-Apache%202-blue)