helenai committed
Commit: 4d562b1
Parent(s): 1501889

Update README.md

Files changed (1):
  1. README.md (+19 -8)
README.md CHANGED
@@ -11,15 +11,26 @@ This is the [ibm-granite/granite-8b-code-instruct](https://huggingface.co/ibm-gr
 
 An example of how to do inference on this model:
 ```python
+from transformers import AutoTokenizer
 from optimum.intel import OVModelForCausalLM
-from transformers import AutoTokenizer, pipeline
 
-# model_id should be set to either a local directory or a model available on the HuggingFace hub.
-model_id = "helenai/ibm-granite-granite-8b-code-instruct-ov"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = OVModelForCausalLM.from_pretrained(model_id)
-pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
-result = pipe("hello world")
-print(result)
+model_path = "helenai/ibm-granite-granite-8b-code-instruct-ov"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = OVModelForCausalLM.from_pretrained(model_path)
+
+# change input text as desired
+chat = [
+    { "role": "user", "content": "Write a code to find the maximum value in a list of numbers." },
+]
+chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+# tokenize the text
+input_tokens = tokenizer(chat, return_tensors="pt")
+# generate output tokens
+output = model.generate(**input_tokens, max_new_tokens=100)
+# decode output tokens into text
+output = tokenizer.batch_decode(output)
+# loop over the batch to print, in this example the batch size is 1
+for i in output:
+    print(i)
 ```
 
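For reference, the snippet removed in this commit used transformers' `pipeline` helper rather than calling `generate` directly. That style still works with optimum-intel's OpenVINO models, since `OVModelForCausalLM` can be passed to a `text-generation` pipeline like a regular transformers model. A minimal sketch of that alternative (not part of this commit, reusing the chat template call from the updated README):

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "helenai/ibm-granite-granite-8b-code-instruct-ov"

# Load the tokenizer and the OpenVINO model from the Hub (or a local directory).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# OVModelForCausalLM can be used with the standard text-generation pipeline.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Build the prompt with the model's chat template, as the updated README does.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a code to find the maximum value in a list of numbers."}],
    tokenize=False,
    add_generation_prompt=True,
)

result = pipe(prompt, max_new_tokens=100)
print(result[0]["generated_text"])
```

The explicit tokenize/generate/decode flow in the updated README makes each step visible; the pipeline variant is shorter but hides those steps.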