Spaces: qgyd2021/qwen_7b_chinese_modern_poetry

qgyd2021 committed on Sep 19, 2023

Commit 6a8e34d • 1 Parent(s): c865805

[update]edit main

Files changed (1)

main.py +23 -112

@@ -34,114 +34,19 @@ def get_args():
 description = """
-## ChatGLM-6B
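-Based on the [firefly-chatglm2-6b](https://huggingface.co/YeungNLP/firefly-chatglm2-6b) model, trained on the prompt dataset of [telemarketing_intent](https://huggingface.co/datasets/qgyd2021/telemarketing_intent/tree/main/data/prompt), with the goal of 1-shot intent recognition in the `telemarketing` scenario.
-The classification task has more than one hundred classes, but only about 30,000 labeled samples in total, half of which are "无关领域" (irrelevant domain). The approach is:
-1. First run a traditional algorithm for hard classification, then take the 10 labels with the highest probability.
-2. Use those top 10 labels as candidate labels and provide one example sentence for each label.
-3. Ask the LLM to output the category of the target sentence.
-The Gradio deployment code is adapted from: https://huggingface.co/spaces/aodianyun/ChatGLM-6B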
 """
 examples = [
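-    """We are doing intent recognition for a telemarketing scenario. The candidate intents are:
-否定(不是); 礼貌用语; 否定答复; 肯定(需要); 用户正忙; 否定(不需要); 无关领域; 否定(没有); 否定(不用了); 价格太高
-If you think the given sentence does not belong to any of these intents, you may answer: 不知道 (don't know).
-Tips:
-1. If the candidate intents include "无关领域" (irrelevant domain) and you are unsure, the sentence may belong to 无关领域.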
-Examples:
----------
-ExampleSentence: 其实不是
-ExampleIntent: 否定(不是)
-ExampleSentence: 嗯!嘿嘿!早点休息,晚安咯
-ExampleIntent: 礼貌用语
-ExampleSentence: 没问诶
-ExampleIntent: 否定答复
-ExampleSentence: 不好意思都需要谢谢
-ExampleIntent: 肯定(需要)
-ExampleSentence: 对呀我在忙
-ExampleIntent: 用户正忙
-ExampleSentence: 。嗯也也不需要吧唉呀现在不需要那个啊嗯
-ExampleIntent: 否定(不需要)
-ExampleSentence: 我的处理器需要很少的电源。
-ExampleIntent: 无关领域
-ExampleSentence: 。呃我好像没有在太平洋买过保险，吧拜拜
-ExampleIntent: 否定(没有)
-ExampleSentence: 嗯不用谢谢
-ExampleIntent: 否定(不用了)
-ExampleSentence: 费用贵。
-ExampleIntent: 价格太高
----------
-Sentence: 。嗯各位不需要，啊谢谢
-Intent:""",
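-    """We are doing intent recognition for a telemarketing scenario. The candidate intents are:
-语音信箱; 无关领域; 查物品信息; 污言秽语; 疑问(时间); 疑问(数值); 答时间; 查收费方式; 价格太高; 答数值
-If you think the given sentence does not belong to any of these intents, you may answer: 不知道 (don't know).
-Tips:
-1. If the candidate intents include "无关领域" (irrelevant domain) and you are unsure, the sentence may belong to 无关领域.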
-Examples:
----------
-ExampleSentence: 我们留言。
-ExampleIntent: 语音信箱
-ExampleSentence: 很刚刚打
-ExampleIntent: 无关领域
-ExampleSentence: 什么东西我听
-ExampleIntent: 查物品信息
-ExampleSentence: 知道!AV女优!日本人的骄傲!
-ExampleIntent: 污言秽语
-ExampleSentence: 最后期限
-ExampleIntent: 疑问(时间)
-ExampleSentence: 一共借了多少钱
-ExampleIntent: 疑问(数值)
-ExampleSentence: 22号
-ExampleIntent: 答时间
-ExampleSentence: 运费
-ExampleIntent: 查收费方式
-ExampleSentence: 利息高
-ExampleIntent: 价格太高
-ExampleSentence: 20。
-ExampleIntent: 答数值
----------
-Sentence: 。对啊什么东西啊我6月份出来的
-Intent:"""
 ]
@@ -175,20 +80,26 @@ def main():
         )
     model = model.eval()
-    def fn(inputs, history=None):
-        if history is None:
-            history = list()
         with torch.no_grad():
-            response, history = model.chat(tokenizer, inputs, history)
-        return history, history
     with gr.Blocks() as blocks:
         gr.Markdown(value=description)
-        state = gr.State([])
         chatbot = gr.Chatbot([], elem_id="chatbot").style(height=400)
         with gr.Row():
             with gr.Column(scale=4):
@@ -198,8 +109,8 @@ def main():
         gr.Examples(examples, text)
-        text.submit(fn, [text, state], [chatbot, state])
-        button.click(fn, [text, state], [chatbot, state])
     blocks.queue().launch()
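
The removed description outlines a three-step pipeline: a conventional classifier proposes the top 10 labels, each label is paired with one example sentence, and the LLM is asked for the intent of the target sentence. The prompt-assembly step could look roughly like the sketch below, which mirrors the layout of the removed example prompts. `build_intent_prompt`, its argument format, and the English labels are hypothetical and do not come from main.py; the instruction text is an English paraphrase of the original wording.

```python
from typing import List, Tuple


def build_intent_prompt(candidates: List[Tuple[str, str]], sentence: str) -> str:
    """Assemble a few-shot intent prompt from (label, example sentence) pairs.

    `candidates` is assumed to hold the top-k labels from an upstream
    classifier, each with one example sentence (hypothetical input format).
    """
    labels = "; ".join(label for label, _ in candidates)
    lines = [
        "We are doing intent recognition for a telemarketing scenario. "
        "The candidate intents are:",
        labels,
        "If the sentence matches none of these intents, answer: unknown.",
        "Examples:",
        "---------",
    ]
    # One example sentence per candidate label, in classifier order.
    for label, example in candidates:
        lines.append(f"ExampleSentence: {example}")
        lines.append(f"ExampleIntent: {label}")
    # The prompt ends with an open "Intent:" slot for the model to complete.
    lines += ["---------", f"Sentence: {sentence}", "Intent:"]
    return "\n".join(lines)


prompt = build_intent_prompt(
    [("busy", "I'm in a meeting right now"), ("price_too_high", "That costs too much")],
    "Sorry, I can't talk at the moment",
)
print(prompt)
```

Restricting the candidate list to the classifier's top labels keeps the prompt short even though the full task has more than one hundred classes.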

 description = """
+## Qwen-7B
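+Based on the [Qwen-7B](https://huggingface.co/qgyd2021/Qwen-7B) model, trained on the prompt dataset of [chinese_modern_poetry](https://huggingface.co/datasets/Iess/chinese_modern_poetry).
+It can be used to generate modern poetry. Example prompt:
+使用下列意象写一首现代诗：智慧，刀刃 (Write a modern poem using the following imagery: wisdom, blade)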
 """
 examples = [
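+    "使用下列意象写一首现代诗：石头，森林",  # "Write a modern poem using the following imagery: stone, forest"
+    "使用下列意象写一首现代诗：花，纱布"  # "Write a modern poem using the following imagery: flower, gauze"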
 ]
         )
     model = model.eval()
+    def fn(inputs: str):
+        input_ids = tokenizer(
+            inputs,
+            return_tensors="pt",
+            add_special_tokens=False,
+        ).input_ids.to(args.device)
         with torch.no_grad():
+            outputs = model.generate(
+                input_ids=input_ids, max_new_tokens=args.max_new_tokens, do_sample=True,
+                top_p=args.top_p, temperature=args.temperature, repetition_penalty=args.repetition_penalty,
+                eos_token_id=tokenizer.eos_token_id
+            )
+            outputs = outputs.tolist()[0][len(input_ids[0]):]
+            response = tokenizer.decode(outputs)
+            response = response.strip().replace(tokenizer.eos_token, "").strip()
+        return response
     with gr.Blocks() as blocks:
         gr.Markdown(value=description)
         chatbot = gr.Chatbot([], elem_id="chatbot").style(height=400)
         with gr.Row():
             with gr.Column(scale=4):
         gr.Examples(examples, text)
+        text.submit(fn, [text], [chatbot])
+        button.click(fn, [text], [chatbot])
     blocks.queue().launch()
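
The new `fn` slices the prompt tokens off the generated sequence, decodes the remainder, and strips the EOS marker before returning. A Gradio 3.x `gr.Chatbot` output generally expects a list of (user, bot) message pairs rather than a bare string, so the value returned by `fn` may need wrapping before it is sent to `chatbot`. Both steps are sketched below without loading a model. `extract_response` and `as_chat_history` are hypothetical helpers, and `decode` stands in for `tokenizer.decode`; none of these names come from main.py.

```python
from typing import Callable, List, Sequence, Tuple


def extract_response(
    output_ids: Sequence[int],
    prompt_len: int,
    decode: Callable[[Sequence[int]], str],
    eos_token: str,
) -> str:
    """Drop the echoed prompt tokens, decode the rest, and strip the EOS text."""
    new_ids = output_ids[prompt_len:]
    text = decode(new_ids)
    return text.strip().replace(eos_token, "").strip()


def as_chat_history(prompt: str, response: str) -> List[Tuple[str, str]]:
    """Wrap one exchange in the list-of-pairs shape a chat display can render."""
    return [(prompt, response)]


# Toy stand-in for a tokenizer: ids index into a tiny vocabulary.
vocab = ["<eos>", "write", "a", "poem", "stone", "and", "forest"]


def toy_decode(ids: Sequence[int]) -> str:
    return " ".join(vocab[i] for i in ids)


generated = [1, 2, 3, 4, 5, 6, 0]  # 3 prompt tokens + 3 new tokens + EOS
response = extract_response(generated, 3, toy_decode, "<eos>")
history = as_chat_history("write a poem", response)
print(history)
```

In the Space itself, the same wrapping would amount to returning `[(inputs, response)]` from `fn`, assuming the Gradio version in use accepts that format for `gr.Chatbot`.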