Update README.md
README.md CHANGED
```diff
@@ -4,7 +4,7 @@ language:
 - en
 ---
 
-# Mistral-7b-Instruct-v0.1-
+# Mistral-7b-Instruct-v0.1-int4-ov
 
 * Model creator: [Mistral AI](https://huggingface.co/mistralai)
 * Original model: [Mistral-7b-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
```
```diff
@@ -25,10 +25,10 @@ For more information on quantization, check the [OpenVINO model optimization guide
 
 The provided OpenVINO™ IR model is compatible with:
 
-* OpenVINO version 2024.
+* OpenVINO version 2024.2.0 and higher
 * Optimum Intel 1.16.0 and higher
 
-## Running Model Inference
+## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
 
 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
 
```
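The tightened compatibility bullets are easy to sanity-check locally. A minimal sketch, assuming that the `pip install optimum[openvino]` step referenced below pulls in the `openvino` and `optimum-intel` distributions (the distribution names are an assumption, not something this commit states):

```python
from importlib.metadata import version

# Distribution names are assumed; `pip install optimum[openvino]`
# normally installs both of these.
print("OpenVINO:", version("openvino"))            # README asks for >= 2024.2.0
print("Optimum Intel:", version("optimum-intel"))  # README asks for >= 1.16.0
```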
```diff
@@ -42,18 +42,11 @@ pip install optimum[openvino]
 from transformers import AutoTokenizer
 from optimum.intel.openvino import OVModelForCausalLM
 
-model_id = "OpenVINO/mistral-7b-instrcut-v0.1-
+model_id = "OpenVINO/mistral-7b-instruct-v0.1-int4-ov"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = OVModelForCausalLM.from_pretrained(model_id)
 
-
-messages = [
-    {"role": "user", "content": "What is your favourite condiment?"},
-    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
-    {"role": "user", "content": "Do you have mayonnaise recipes?"}
-]
-
-inputs = tokenizer.apply_chat_template(messages, return_tensors="pt")
+inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
 
 outputs = model.generate(inputs, max_new_tokens=20)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
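Pulled out of the diff, the updated example reads as below. A minimal runnable sketch: note that `tokenizer(...)` returns a dict-like `BatchEncoding` rather than the bare tensor that `apply_chat_template(...)` produced, so this sketch unpacks it with `**inputs` instead of keeping the positional `model.generate(inputs, ...)` call that survives as a context line above.

```python
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_id = "OpenVINO/mistral-7b-instruct-v0.1-int4-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loads the int4 OpenVINO IR directly; no PyTorch checkpoint is needed.
model = OVModelForCausalLM.from_pretrained(model_id)

# tokenizer(...) returns a BatchEncoding (input_ids + attention_mask),
# so it is unpacked into generate() rather than passed positionally.
inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```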
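The commit drops the multi-turn chat example, but the underlying model is still instruction-tuned, so the chat-template path works with the int4 IR as well. A minimal sketch reusing one of the removed messages; here `apply_chat_template(...)` returns a plain tensor of input ids, which is why the positional `generate` call was valid in the old snippet:

```python
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_id = "OpenVINO/mistral-7b-instruct-v0.1-int4-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# One turn from the conversation the commit removed.
messages = [{"role": "user", "content": "What is your favourite condiment?"}]

# apply_chat_template returns a tensor of input ids, so the positional
# call used by the old snippet works here.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```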