OpenVINO/TinyLlama-1.1B-Chat-v1.0-int8-ov

@@ -2,23 +2,19 @@
 license: apache-2.0
 ---
-<!-- Model name used as model card title -->
 # TinyLlama-1.1B-Chat-v1.0-int8-ov
-<!-- Original model reference -->
  * Model creator: [TinyLlama](https://huggingface.co/TinyLlama)
  * Original model: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
-<!-- Description of converted model -->
 ## Description
-<!-- Comment and reference on NNCF applicable only for INT8 and INT4 models -->
 This is the [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format, with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
 ## Quantization Parameters
 Weight compression was performed using `nncf.compress_weights` with the following parameters:
 * mode: **INT8_ASYM**
 * ratio: **1.0**
@@ -33,8 +29,6 @@ The provided OpenVINO™ IR model is compatible with:
 ## Running Model Inference
-<!-- Example model usage -->
 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
 ```
@@ -43,8 +37,6 @@ pip install optimum[openvino]
 2. Run model inference:
-<!-- Usage example can be adopted from original model usage example -->
 ```
 from transformers import AutoTokenizer
 from optimum.intel.openvino import OVModelForCausalLM
@@ -64,7 +56,6 @@ For more examples and possible optimizations, refer to the [OpenVINO Large Langu
 ## Legal information
-<!-- Note about original model license -->
 The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
 ## Disclaimer

 license: apache-2.0
 ---
 # TinyLlama-1.1B-Chat-v1.0-int8-ov
  * Model creator: [TinyLlama](https://huggingface.co/TinyLlama)
  * Original model: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
 ## Description
 This is the [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format, with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
 ## Quantization Parameters
 Weight compression was performed using `nncf.compress_weights` with the following parameters:
 * mode: **INT8_ASYM**
 * ratio: **1.0**
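
To illustrate what asymmetric 8-bit compression of a weight row involves, here is a minimal pure-Python sketch. It is not NNCF's implementation: the helper names `quantize_row_int8_asym` and `dequantize_row` are hypothetical, and per-row scaling with a range that always includes zero is an assumption made for the illustration. With NNCF itself, the parameters above would presumably be passed as `nncf.compress_weights(model, mode=nncf.CompressWeightsMode.INT8_ASYM, ratio=1.0)`.

```python
def quantize_row_int8_asym(row):
    """Map a row of float weights to integers in [0, 255] (illustrative only).

    Returns the quantized values plus the scale and zero point needed
    to approximately reconstruct the original floats.
    """
    # Assumption: the representable range always includes 0.0.
    lo = min(min(row), 0.0)
    hi = max(max(row), 0.0)
    # One quantization step; fall back to 1.0 for an all-zero row.
    scale = (hi - lo) / 255.0 or 1.0
    # Integer code that represents the float value 0.0.
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in row]
    return q, scale, zero_point


def dequantize_row(q, scale, zero_point):
    """Approximate the original floats from their 8-bit codes."""
    return [(v - zero_point) * scale for v in q]


weights = [-0.42, 0.0, 0.13, 0.87]  # made-up example values
codes, scale, zero_point = quantize_row_int8_asym(weights)
restored = dequantize_row(codes, scale, zero_point)
print(codes, restored)
```

The "asymmetric" part is the zero point: the 256 available codes cover the actual minimum-to-maximum range of the row rather than a range centered on zero, so each restored value differs from the original by roughly half a quantization step at most.
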
 ## Running Model Inference
 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
 ```
 pip install optimum[openvino]
 ```
 2. Run model inference:
 ```
 from transformers import AutoTokenizer
 from optimum.intel.openvino import OVModelForCausalLM
 ```
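
The listing above shows only the imports of the usage example. A fuller version might look like the following sketch. The model ID is assumed from the repository name at the top of this page, while the function name `generate_reply`, the prompt, and the `max_new_tokens` value are illustrative choices that do not come from the model card.

```python
# Assumption: repository ID taken from the page header, not from the card's code.
MODEL_ID = "OpenVINO/TinyLlama-1.1B-Chat-v1.0-int8-ov"


def generate_reply(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Load the OpenVINO IR model and generate a continuation of `prompt`."""
    # Imports are deferred so that defining this function does not
    # require the packages from step 1 to be installed yet.
    from transformers import AutoTokenizer
    from optimum.intel.openvino import OVModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = OVModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt and let the model generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back to text.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
```

Calling `print(generate_reply("What is OpenVINO?"))` should fetch the model from the Hugging Face Hub on first use and cache it locally, so the first call takes noticeably longer than later ones.
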
 ## Legal information
 The original model is distributed under the [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
 ## Disclaimer