vince62s committed on
Commit 72818cc · 1 Parent(s): 31d56ec

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -23,15 +23,15 @@ Boston is a great city with many attractions to visit. Here are some popular one
  If you run with a batch size of 60 you can get a nice throughput even with GEMV:
 
  ```
- [2023-12-20 08:41:50,556 INFO] Loading checkpoint from /mnt/InternalCrucial4/dataAI/mistral-7B/mistral-instruct/mistral-onmt-awq.pt
- [2023-12-20 08:41:50,647 INFO] aawq_gemv compression of layer ['w_1', 'w_2', 'w_3', 'linear_values', 'linear_query', 'linear_keys', 'final_linear']
- [2023-12-20 08:41:54,655 INFO] Loading data into the model
- step0 time: 1.2817533016204834
- [2023-12-20 08:42:01,746 INFO] PRED SCORE: -0.2969, PRED PPL: 1.35 NB SENTENCES: 59
- [2023-12-20 08:42:01,746 INFO] Total translation time (s): 6.1
- [2023-12-20 08:42:01,746 INFO] Average translation time (ms): 104.2
- [2023-12-20 08:42:01,746 INFO] Tokens per second: 1923.2
- Time w/o python interpreter load/terminate: 11.200659036636353
+ [2023-12-27 14:54:47,967 INFO] Loading checkpoint from /mnt/InternalCrucial4/dataAI/mistral-7B/mistral-instruct-v0.2/mistral-instruct-v0.2-onmt-awq-gemv.pt
+ [2023-12-27 14:54:48,063 INFO] awq_gemv compression of layer ['w_1', 'w_2', 'w_3', 'linear_values', 'linear_query', 'linear_keys', 'final_linear']
+ [2023-12-27 14:54:52,059 INFO] Loading data into the model
+ step0 time: 1.2714881896972656
+ [2023-12-27 14:54:59,180 INFO] PRED SCORE: -0.2316, PRED PPL: 1.26 NB SENTENCES: 59
+ [2023-12-27 14:54:59,180 INFO] Total translation time (s): 6.1
+ [2023-12-27 14:54:59,180 INFO] Average translation time (ms): 103.5
+ [2023-12-27 14:54:59,180 INFO] Tokens per second: 2183.8
+ Time w/o python interpreter load/terminate: 11.222625255584717
  ```
 
 