tim001's picture
Update README.md
41ab872 verified
metadata
language: en
license: apache-2.0
datasets:
  - squad
  - whiteDandelion/QA-Dataset-Financial-Information
model-index:
  - name: distilbert-base-cased-distilled-squad
    results:
      - task:
          type: question-answering
          name: Question Answering
        dataset:
          name: finacial question answer data + squad
          type: squad
          config: plain_text
          split: validation
        metrics:
          - type: exact_match
            value: 79.5998
            name: Exact Match
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTViZDA2Y2E2NjUyMjNjYjkzNTUzODc5OTk2OTNkYjQxMDRmMDhlYjdmYWJjYWQ2N2RlNzY1YmI3OWY1NmRhOSIsInZlcnNpb24iOjF9.ZJHhboAMwsi3pqU-B-XKRCYP_tzpCRb8pEjGr2Oc-TteZeoWHI8CXcpDxugfC3f7d_oBcKWLzh3CClQxBW1iAQ
          - type: f1
            value: 86.9965
            name: F1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWZlMzY2MmE1NDNhOGNjNWRmODg0YjQ2Zjk5MjUzZDQ2MDYxOTBlMTNhNzQ4NTA2NjRmNDU3MGIzMTYwMmUyOSIsInZlcnNpb24iOjF9.z0ZDir87aT7UEmUeDm8Uw0oUdAqzlBz343gwnsQP3YLfGsaHe-jGlhco0Z7ISUd9NokyCiJCRc4NNxJQ83IuCw

How to Get Started with the distilbert-base-cased-distilled-finqa

Here is how to use this model in PyTorch:

from transformers import DistilBertTokenizer, DistilBertModel
import torch
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased-distilled-finqa')
model = DistilBertModel.from_pretrained('distilbert-base-cased-distilled-finqa')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs)

And in TensorFlow:

from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
import tensorflow as tf

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-cased-distilled-finqa")
model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-finqa")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="tf")
outputs = model(**inputs)

answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens)