---
library_name: ctranslate2
license: apache-2.0
base_model:
- internlm/internlm3-8b-instruct
base_model_relation: quantized
tags:
- ctranslate2
- internlm3
- chat
---

### CTranslate2 conversion of InternLM3-8B-Instruct to AWQ 4-bit format
1) First quantized to AWQ 4-bit, using the [cosmopedia-100k dataset](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia-100k) for calibration.
2) Then converted to a CTranslate2-compatible format, as sketched below.
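
For reference, a minimal sketch of that two-step pipeline follows. This is an assumed reconstruction, not the exact script used: it presumes the AutoAWQ package for step 1 and CTranslate2's `TransformersConverter` (recent 4.x releases can be pointed at AWQ checkpoints) for step 2; the output paths, calibration sample count, and dataset column name are all illustrative.

```python
# Sketch of the assumed two-step conversion; paths and calibration details are illustrative.
import ctranslate2
from awq import AutoAWQForCausalLM
from datasets import load_dataset
from transformers import AutoTokenizer

model_id = "internlm/internlm3-8b-instruct"
awq_dir = "internlm3-8b-instruct-awq"  # hypothetical intermediate directory
ct2_dir = "internlm3-8b-instruct-ct2"  # hypothetical final directory

# Step 1: AWQ 4-bit quantization, calibrated on cosmopedia-100k text samples.
model = AutoAWQForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
calib = load_dataset("HuggingFaceTB/cosmopedia-100k", split="train")
calib_texts = [row["text"] for row in calib.select(range(512))]  # sample count is a guess
model.quantize(
    tokenizer,
    quant_config={"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"},
    calib_data=calib_texts,
)
model.save_quantized(awq_dir)
tokenizer.save_pretrained(awq_dir)

# Step 2: convert the AWQ checkpoint into the CTranslate2 format.
converter = ctranslate2.converters.TransformersConverter(awq_dir, trust_remote_code=True)
converter.convert(ct2_dir)
```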

[Original model here](https://huggingface.co/internlm/internlm3-8b-instruct)

# Example Usage

<details><summary>Non-Streaming Example:</summary>

```python
import ctranslate2
from transformers import AutoTokenizer

def generate_response(prompt: str, system_message: str, model_path: str) -> str:
    generator = ctranslate2.Generator(
        model_path,
        device="cuda",
    )
    # trust_remote_code allows loading InternLM3's custom tokenizer code.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # Build the ChatML-style prompt that InternLM3 expects.
    formatted_prompt = f"""<s><|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
    tokens = tokenizer.tokenize(formatted_prompt)
    results = generator.generate_batch(
        [tokens],
        max_length=1024,
        sampling_temperature=0.7,
        include_prompt_in_result=False,
        end_token="<|im_end|>",
        return_end_token=False,
    )
    response = tokenizer.decode(results[0].sequences_ids[0], skip_special_tokens=True)
    return response

if __name__ == "__main__":
    model_path = "path/to/your/internlm3-ct2-model"
    system_message = "You are a helpful AI assistant."
    user_prompt = "Write a short poem about a cat."
    response = generate_response(user_prompt, system_message, model_path)
    print("\nGenerated response:")
    print(response)
```
</details>
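
For comparison, a token-streaming variant is sketched below. It is an assumed example built on CTranslate2's `generate_tokens` API, which yields one result per generated token; prompt formatting mirrors the non-streaming example, and the per-token decode shown is a simplification (SentencePiece pieces may need buffering for exact whitespace).

<details><summary>Streaming Example (sketch):</summary>

```python
import ctranslate2
from transformers import AutoTokenizer

def stream_response(prompt: str, system_message: str, model_path: str) -> None:
    generator = ctranslate2.Generator(model_path, device="cuda")
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    formatted_prompt = f"""<s><|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
    tokens = tokenizer.tokenize(formatted_prompt)
    # generate_tokens yields a GenerationStepResult per token as it is decoded.
    for step in generator.generate_tokens(
        tokens,
        max_length=1024,
        sampling_temperature=0.7,
        end_token="<|im_end|>",
    ):
        # Naive per-token decode; adequate for a demo, lossy for some whitespace.
        print(tokenizer.decode([step.token_id], skip_special_tokens=True), end="", flush=True)
    print()

if __name__ == "__main__":
    stream_response(
        "Write a short poem about a cat.",
        "You are a helpful AI assistant.",
        "path/to/your/internlm3-ct2-model",
    )
```
</details>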