---
base_model: Qwen/Qwen2.5-0.5B-Instruct
language:
- en
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
- openvino
- openvino-export
---

This model was converted to OpenVINO from [`Qwen/Qwen2.5-0.5B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) using [optimum-intel](https://github.com/huggingface/optimum-intel) via the [export](https://huggingface.co/spaces/echarlaix/openvino-export) space.

First make sure you have optimum-intel installed:

```bash
pip install "optimum[openvino]"
```

To serve the model from a Hugging Face Space, use the following `app.py`:

```python
import gradio as gr
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Load the model and tokenizer
model_id = "HelloSun/Qwen2.5-0.5B-Instruct-openvino"
model = OVModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the generation pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

def respond(message, history):
    # Use only the current message; the history could be merged in instead:
    # input_text = message if not history else history[-1]["content"] + " " + message
    input_text = message

    # Get the model's response
    response = pipe(input_text, max_length=500, truncation=True, num_return_sequences=1)
    reply = response[0]['generated_text']

    # Log the exchange and return the reply in the new message format
    print(f"Message: {message}")
    print(f"Reply: {reply}")
    return reply

# Set up the Gradio chat interface
demo = gr.ChatInterface(fn=respond,
                        title="Chat with Qwen(通義千問) 2.5-0.5B",
                        description="Chat with HelloSun/Qwen2.5-0.5B-Instruct-openvino!",
                        type='messages')

if __name__ == "__main__":
    demo.launch()
```

`requirements.txt`:

```text
huggingface_hub==0.25.2
optimum[openvino]
```
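For a quick local check without Gradio, a minimal generation sketch is shown below. Since this is an instruct model, applying the tokenizer's chat template is the idiomatic way to format prompts; the prompt text and generation parameters here are illustrative, not prescribed by the model card:

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "HelloSun/Qwen2.5-0.5B-Instruct-openvino"
model = OVModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Format a single-turn conversation with the model's chat template
messages = [{"role": "user", "content": "Give me a short introduction to OpenVINO."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Generate, then decode only the newly produced tokens
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```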
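If you prefer to reproduce the conversion locally rather than via the export space, optimum-intel ships an `optimum-cli` exporter; a minimal sketch, where the output directory name is arbitrary:

```bash
optimum-cli export openvino --model Qwen/Qwen2.5-0.5B-Instruct Qwen2.5-0.5B-Instruct-openvino
```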