Update README.md
(Japanese caption: 日本語の (抽出型) 質問応答のモデル)

This model is a fine-tuned version of [rinna/japanese-roberta-base](https://huggingface.co/rinna/japanese-roberta-base), trained for extractive question answering.

The model was fine-tuned on the [JaQuAD](https://huggingface.co/datasets/SkelterLabsInc/JaQuAD) dataset provided by Skelter Labs, in which the data was collected from Japanese Wikipedia articles and annotated by humans.
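If you want to inspect the training data yourself, the dataset can be loaded with the `datasets` library. A minimal sketch, assuming the Hub ID from the link above and the SQuAD-style record layout described on the dataset card:

```python
from datasets import load_dataset

# Load JaQuAD from the Hugging Face Hub
# (repository ID taken from the dataset card linked above)
jaquad = load_dataset("SkelterLabsInc/JaQuAD")

# Records follow the SQuAD layout: question, context, and annotated answers
print(jaquad["train"][0]["question"])
print(jaquad["train"][0]["answers"])
```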
## Intended uses

When running with a dedicated pipeline:

```python
from transformers import pipeline

model_name = "tsmatz/roberta_qa_japanese"
# Build a question-answering pipeline from the fine-tuned checkpoint
qa_pipeline = pipeline(
    "question-answering",
    model=model_name,
    tokenizer=model_name)
# The question and context here follow the forward-pass example below
result = qa_pipeline(
    question="決勝トーナメントで日本に勝ったのはどこでしたか。",
    context="日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。")
print(result)
```
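For a single question/context pair, the pipeline returns a dict with the keys `score`, `start`, `end`, and `answer`, where `start` and `end` are character offsets into `context`.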
When manually running through the forward pass:

```python
import torch
import numpy as np
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "tsmatz/roberta_qa_japanese"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def inference_answer(question, context):
    # Encode the question and context together into a single feature
    test_feature = tokenizer(
        question,
        context,
        max_length=318,
    )
    # Forward pass without gradient tracking
    with torch.no_grad():
        outputs = model(torch.tensor([test_feature["input_ids"]]))
    start_logits = outputs.start_logits.cpu().numpy()
    end_logits = outputs.end_logits.cpu().numpy()
    # Decode the span between the most likely start and end positions
    answer_ids = test_feature["input_ids"][np.argmax(start_logits):np.argmax(end_logits)+1]
    return "".join(tokenizer.batch_decode(answer_ids))

# "Which team beat Japan in the knockout stage?"
question = "決勝トーナメントで日本に勝ったのはどこでしたか。"
# "Japan beat the powerhouses Germany and Spain in the group stage and
# advanced to the knockout stage, but lost to Croatia."
context = "日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。"
answer_pred = inference_answer(question, context)
print(answer_pred)
```
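One caveat with the decoding above: taking the argmax of the start and end logits independently can yield an inverted span (end before start), which decodes to an empty answer. A minimal sketch of safer span selection; the helper name and length cap are illustrative, not part of the original card:

```python
import numpy as np

def best_valid_span(start_logits, end_logits, max_answer_len=30):
    """Pick the highest-scoring (start, end) pair with start <= end."""
    best_score, best_span = -np.inf, (0, 0)
    for s in range(len(start_logits)):
        # Only consider end positions at or after the start, within a length cap
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    return best_span

# Usage with the logits computed in inference_answer (take the first batch row):
#   s, e = best_valid_span(start_logits[0], end_logits[0])
#   answer_ids = test_feature["input_ids"][s:e + 1]
```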
## Training procedure